This is a collection of five papers that concern applications of ideas from proof theory
to problems in the semantics of types and concurrency. In the order they are arranged,
the  articles  are 

1.  Inheritance  as  implicit  coercion.  Val  Breazu-Tannen,  Thierry  Coquand,  Carl  A. 
Gunter,  and  Andre  Scedrov.  Information  and  Computation,  vol.  93  (1991), 
pp.  172-221  by  invitation  from  the  papers  presented  at  the  1989  Symposium  on 
Logic  in  Computer  Science. 

2. Computing with coercions, Val Breazu-Tannen, Carl A. Gunter, and Andre Scedrov.
In: Conference on Lisp and Functional Programming, edited by
M. Wand, Nice, France, July 1990, pp. 44-60.

3.  Nets  as  tensor  theories  (preliminary  report),  Carl  A.  Gunter  and  Vijay  Gehlot. 
In:  Conference  on  Application  and  Theory  of  Petri  Nets,  edited  by  G.  De 
Michelis, Bonn, F.R.G., June 1989, pp. 174-191.

4. Normal process representatives. Vijay Gehlot and Carl A. Gunter. In: Symposium
on Logic in Computer Science, edited by J. Mitchell, IEEE Computer
Society Press, Philadelphia, Pennsylvania, June 1990, pp. 200-207.

5. Reference counting as a computational interpretation of linear logic, Jawahar Chirimar,
Carl A. Gunter, and Jon Riecke. To appear in: Journal of Functional
Programming. 

For  further  reading  on  the  coherence  issues  studied  in  the  first  two  articles,  one  may 
consult the MIT Press collection Theoretical Aspects of Object Oriented Programming
Languages: Types, Semantics, and Language Design, edited by
John C. Mitchell and Carl A. Gunter. Further details about the third and fourth topics
can  be  found  in  Vijay  Gehlot’s  1992  University  of  Pennsylvania  Ph.D.  dissertation,  A 
Proof-Theoretic  Approach  to  Semantics  of  Concurrency. 

INHERITANCE AS IMPLICIT COERCION¹


Val Breazu-Tannen  Thierry Coquand  Carl A. Gunter  Andre Scedrov²


Abstract.  We  present  a  method  for  providing  semantic  interpretations  for  languages  with  a 
type  system  featuring  inheritance  polymorphism.  Our  approach  is  illustrated  on  an  extension  of 
the  language  Fun  of  Cardelli  and  Wegner,  which  we  interpret  via  a  translation  into  an  extended 
polymorphic  lambda  calculus.  Our  goal  is  to  interpret  inheritances  in  Fun  via  coercion  functions 
which  are  definable  in  the  target  of  the  translation.  Existing  techniques  in  the  theory  of  semantic 
domains  can  be  then  used  to  interpret  the  extended  polymorphic  lambda  calculus,  thus  providing 
many  models  for  the  original  language.  This  technique  makes  it  possible  to  model  a  rich  type 
discipline  which  includes  parametric  polymorphism  and  recursive  types  as  well  as  inheritance. 

A  central  difficulty  in  providing  interpretations  for  explicit  type  disciplines  featuring  inheritance 
in  the  sense  discussed  in  this  paper  arises  from  the  fact  that  programs  can  type-check  in  more 
than  one  way.  Since  interpretations  follow  the  type-checking  derivations,  coherence  theorems 
are  required:  that  is,  one  must  prove  that  the  meaning  of  a  program  does  not  depend  on  the 
way it was type-checked. The proofs of such theorems for our proposed interpretation are the
basic  technical  results  of  this  paper.  Interestingly,  proving  coherence  in  the  presence  of  recursive 
types,  variants,  and  abstract  types  forced  us  to  reexamine  fundamental  equational  properties 
that  arise  in  proof  theory  (in  the  form  of  commutative  reductions)  and  domain  theory  (in  the 
form  of  strict  vs.  non-strict  functions). 


1  Introduction 

In  this  paper  we  will  discuss  an  approach  to  the  semantics  of  a  particular  form  of  inheritance  which 
has  been  promoted  by  John  Reynolds  and  Luca  Cardelli.  This  inheritance  system  is  based  on  the 
idea that one may axiomatize a relation ≤ between type expressions in such a way that whenever
the inheritance judgement s ≤ t is provable for type expressions s and t, then an expression of type
s can be “considered as” an expression of type t. This property is expressed by the inheritance
rule (sometimes also called the subsumption rule), which states that if an expression e is of type s
and s ≤ t, then e also has type t. The consequences from a semantic point of view of the inclusion
of  this  form  of  typing  rule  are  significant.  It  is  our  goal  in  this  paper  to  look  carefully  at  what  we 
consider  to  be  a  robust  and  intuitive  approach  to  systems  which  have  this  form  of  inheritance  and 
examine  in  some  detail  the  semantic  implications  of  the  inclusion  of  inheritance  judgements  and 
the  inheritance  rule  in  a  type  discipline. 

Several  attempts  have  been  made  recently  to  express  some  of  the  distinctive  features  of  object- 
oriented  programming,  principally  inheritance,  in  the  framework  of  a  rich  type  discipline  which 
can  accommodate  strong  static  type-checking.  This  endeavor  searches  for  a  language  that  offers 
some of the flexibility of object-oriented programming [GR83] while maintaining the reliability, and
sometimes the increased efficiency, of programs which type-check at compile-time (see [BBG88] for a
related  comparison). 

¹Appears in Information and Computation vol. 93 (1991), pp. 172-221.

²Authors’ addresses. Breazu-Tannen and Gunter: Department of Computer and Information Sciences, University
of  Pennsylvania,  Philadelphia  PA  19104,  USA.  Coquand:  INRIA,  Domaine  de  Voluceau,  78150  Rocquencourt,  France. 
Scedrov:  Department  of  Mathematics,  University  of  Pennsylvania,  Philadelphia  PA  19104,  USA. 

A type system of Reynolds introduced in [Rey80] captured some basic intuitions about
inheritance relations between familiar type expressions built from records, variants (sums) and
higher types. A language which exploited this form of type discipline was developed by Cardelli
in [Car84, Car88a] where the first attempt was made to describe a rigorous form of mathematical
semantics for such a system. His approach uses ideals and it is shown that the type discipline is
consistent with the semantics in the sense that type-checking is shown to “prevent type errors”.
Subsequent work has aimed at combining inheritance with richer type disciplines, in particular featuring
parametric polymorphism. One direction of research [Wan87, JM88, OB88, Sta88] has investigated
expressing inheritance and type inference mechanisms, similarly to the way in which parametric
polymorphism is expressed in ML-like languages. Another direction of research investigates
expressing inheritance through explicit subtyping mechanisms which are part of the type-checking systems,
such as in Cardelli and Wegner’s language Fun [CW85] and further work [Car88b, Car89a, CM89].
Cardelli  and  Wegner  sketch  a  model  for  Fun  based  on  ideals.  An  extensional  model  for  Fun  was 
subsequently  described  by  Bruce  and  Longo  [BL88].  Their  model  interprets  inheritances  as  identity 
relations  between  partial  equivalence  relations  (PER’s).  Another  model  of  Fun,  using  the  interval 
interpretation  of  Cartwright  [Car85]  has  been  given  by  Martini  [Mar88].  In  Martini’s  semantics, 
inheritance  is  interpreted  as  a  form  of  inclusion  between  intervals.  This  model  also  includes  a 
general  recursion  operator  for  functions  (but  not  types). 

In  this  paper  we  present  a  novel  approach  to  the  problem  of  developing  a  simple  mathematical 
semantics for languages which feature inheritance in the sense of Reynolds and Cardelli. The form
of semantics that we propose will take a significant departure from the characteristic shared by
the semantics mentioned above. We will not attempt to model inheritance as a binary relation
on a family of types. In particular, our interpretation will not use anything like an inclusion
relation  between  types.  Instead,  we  interpret  the  inheritance  relation  between  type  expressions  as 
indicating  a  certain  coercion  which  remains  implicit  in  instances  in  which  the  inheritance  is  used  in 
type-checking.  We  show  how  these  coercions  can  be  made  explicit  using  definable  terms  of  a  calculus 
without  inheritance,  and  thus  depart  from  the  “relational”  interpretation  of  the  inheritance  concept. 
Using  this  idea,  we  are  able  to  show  how  many  of  the  models  of  polymorphism  and  recursive  types 
which  have  no  relevant  concept  of  type  inclusion,  can  nevertheless  be  seen  as  models  for  a  calculus 
with  inheritance. 

We illustrate our approach on the language Fun of Cardelli and Wegner extended with recursive
types, but the kind of results we obtain are non-trivial for any calculus that combines inheritance,
parametric polymorphism, and recursive types. The method we propose proceeds first with a
translation of Fun into an extended polymorphic lambda calculus with recursive types. As we mentioned
above,  this  translation  interprets  inheritances  in  Fun  as  coercion  functions  already  definable  in  the 
extended polymorphic lambda calculus. Then, we can use existing techniques for modeling
polymorphism and recursion (such as those described in [ABL86, Gir86, CGW87, CGW89]) to interpret
the  extended  polymorphic  lambda  calculus,  thus  providing  models  for  the  original  language  with 
inheritance.  This  method  achieves  simultaneous  modeling  of  parametric  polymorphism,  recursive 
types,  and  inheritance.  In  the  process,  the  paradigm  “inheritance  as  definable  coercion”  proves 
itself  remarkably  robust,  which  makes  us  confident  that  it  will  apply  to  a  large  class  of  rich  type 
disciplines  with  inheritance. 

The paper is divided into seven sections. Following this introduction, the second section
provides some general examples and motivation to prepare the reader for the technical details in the
subsequent  sections.  The  third  section  discusses  how  our  semantics  applies  to  a  calculus  SOURCE 
which  has  inheritance,  exponentials,  records,  generics  and  recursive  types.  We  show  how  this  is 
translated  into  a  calculus  TARGET  without  inheritance  and  state  our  results  about  the  coherence 
of the translation. We hope that the results in this simpler setting will help the reader get an idea
of  what  our  program  is  before  we  proceed  to  a  more  interesting  calculus  in  the  remainder  of  the 
paper.  The  fourth  section  is  devoted  to  developing  a  translation  for  an  expanded  calculus  which 
adds  variants.  Fundamental  equational  properties  of  variants  lead  us  to  develop  a  target  language 
which  has  a  type  of  coercions.  The  fifth  section,  which  contains  the  difficult  technical  results  of  the 
paper,  shows  that  our  translation  is  coherent.  In  the  sixth  section  we  discuss  mathematical  models 
for  the  full  calculus.  Since  most  of  the  work  has  already  been  done,  we  are  able  to  produce  many 
models  using  standard  domain-theoretic  techniques.  The  concluding  section  makes  some  remarks 
about what we feel has been achieved and what new challenges still need to be confronted.

2  Inheritance as implicit coercion

A simple analogy will help explain our translation-based technique. Consider how the ordinary
untyped λ-calculus is interpreted semantically in such sources as [Sco80, Mey82, Koy82, Bar84]. One
begins by postulating the existence of a semantic domain D and a pair of arrows Φ: D → (D → D)
and Ψ: (D → D) → D such that Φ ∘ Ψ is the identity on D → D. Certain conditions are required
of D → D to ensure that “enough” functions are present. To interpret an untyped λ-term, one
defines a translation M ↦ M* on terms which takes an untyped term M and creates a typed term
M*. This operation is defined by induction:

• for a variable, x* = x: D,

• for an application, M(N)* = Φ(M*)(N*) and,

• for an abstraction, (λx. M)* = Ψ(λx: D. M*)

(where we use = for syntactic equality of expressions). For example, the familiar term

λf. (λx. f(xx))(λx. f(xx))


translates to


Ψ(λf: D. Φ(Ψ(λx: D. Φ(f)(Φ(x)(x))))(Ψ(λx: D. Φ(f)(Φ(x)(x))))).

The fact that the latter term is unreadable is perhaps an indication of why we use the former
term in which the semantic coercions are implicit. Nevertheless, this translation provides us with
the desired semantics for the untyped term since we have converted that term into a term in a
calculus which we know how to interpret. Of course, this assumes that we really do know how to
provide a semantics for the typed calculus supplemented with triples such as (D, Φ, Ψ). Moreover,
there are some equations we must check to show that the translation is sound. But, at the end
of the day, we have a simple, intuitive explanation of the interpretation of untyped λ-terms based
on our understanding of a certain simply typed λ-theory. In this paper we show how a similar
technique may be used to provide an intuitive interpretation for inheritance, even in the presence
of parametric polymorphism and type recursion. As mentioned earlier, our interpretation is carried
out by translating the full calculus into a calculus without inheritance (the target calculus) whose
semantics we already understand. However, our idea differs significantly from the interpretation
of the untyped λ-calculus as described above in at least one important respect: typically, the
coercions (such as Φ and Ψ above) which we introduce will be definable in the target calculus.
Hence our target calculus needs to be an extension of the ordinary polymorphic λ-calculus with
records, variants, abstract types, and recursive types. But it need not have any inheritance.
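
The inductive translation of untyped terms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Var, App and Lam, the function translate, and the ASCII spellings Phi, Psi and lam for the two coercions and the binder are hypothetical choices made for this sketch, and the output is just a string rendering of the typed term.

```python
from dataclasses import dataclass


# Untyped lambda terms (hypothetical names, chosen for this sketch).
@dataclass
class Var:
    name: str


@dataclass
class App:
    fun: object
    arg: object


@dataclass
class Lam:
    param: str
    body: object


def translate(term):
    """Render the typed term M* as a string, making Phi and Psi explicit."""
    if isinstance(term, Var):
        # x* = x (the variable is declared with type D)
        return term.name
    if isinstance(term, App):
        # M(N)* = Phi(M*)(N*)
        return f"Phi({translate(term.fun)})({translate(term.arg)})"
    if isinstance(term, Lam):
        # (lam x. M)* = Psi(lam x:D. M*)
        return f"Psi(lam {term.param}:D. {translate(term.body)})"
    raise TypeError(f"not a lambda term: {term!r}")


# The familiar term  lam f. (lam x. f(xx))(lam x. f(xx))
half = Lam("x", App(Var("f"), App(Var("x"), Var("x"))))
fixed_point = Lam("f", App(half, half))
print(translate(fixed_point))
```

Running the sketch prints the fully coerced form of the term, which makes concrete why the coercions are normally left implicit.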

From this lead, we may now propose a way to explain the semantics of an expression in a
language with inheritance. Our semantics interprets typing judgements, i.e. assertions Γ ⊢ e: s
that expression e has type s in context Γ. Ordinarily such a judgement is assigned a semantics
inductively in the proof of the judgement using the typing rules. However, the system we are
considering may also include instances of the inheritance rule which says that if e has type s and s
is a subtype of t, then e has type t. How are we to relate the interpretation of the type expressions
s and t so that the meaning of e can be viewed as living in both places? Our proposal: the proof
that s is a subtype of t generates a coercion P from s into t. The inheritance (subsumption) rule
is interpreted by the application of the coercion P to the interpretation of e as an element of s. It
will be seen below that this technique can be made to work very smoothly since the language we
are  interpreting  may  have  a  familiar  inheritance-free  fragment  in  which  coercions  such  as  P  can 
be  defined.  In  effect,  we  can  therefore  “project”  the  language  onto  an  inheritance-free  fragment  of 
itself. 
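
As a rough illustration of this proposal, the following Python sketch builds a coercion by recursion on a proof of inheritance and interprets the inheritance rule by applying it. Everything here is an assumption made for the sketch rather than the paper's definition: records are modelled as dictionaries, the proof constructors Refl, ToTop and Recd are hypothetical names, and Top is interpreted by a single element UNIT.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Refl:
    """Hypothetical proof object for t <= t."""


@dataclass
class ToTop:
    """Hypothetical proof object for t <= Top."""


@dataclass
class Recd:
    """Hypothetical proof that a record type is below the record type
    having exactly the listed fields; each field carries its own proof."""
    fields: Dict[str, object]


UNIT = ()  # assumption of this sketch: Top is interpreted by one element


def coercion(proof):
    """Turn a proof of s <= t into a function from meanings at s to meanings at t."""
    if isinstance(proof, Refl):
        return lambda value: value
    if isinstance(proof, ToTop):
        return lambda value: UNIT
    if isinstance(proof, Recd):
        parts = {label: coercion(p) for label, p in proof.fields.items()}
        # Keep only the fields named in the supertype, coercing each one.
        return lambda rec: {label: c(rec[label]) for label, c in parts.items()}
    raise TypeError(f"not a proof: {proof!r}")


def subsume(meaning_at_s, proof):
    """Inheritance rule: the meaning at t is the coercion applied to the meaning at s."""
    return coercion(proof)(meaning_at_s)


point3d = {"x": 1, "y": 2, "z": 3}
to_point2d = Recd({"x": Refl(), "y": Refl()})
print(subsume(point3d, to_point2d))  # {'x': 1, 'y': 2}
```

The point of the design is that no inclusion between the two record types is needed: the value at the supertype is computed from the value at the subtype by an ordinary, definable function.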

For  further  illustration,  let  us  now  look  at  an  example  which  combines  parametric  polymorphism 
and inheritance. In the polymorphic λ-calculus, it is possible to form expressions in which there are
abstractions over type variables. For example, the term e = Λa. λx: a. x is an operator which takes
a type s as an argument and returns the identity function λx: s. x on that type as a value. The type
of e is indicated by the expression ∀a. a → a. Semantically, one may think of the meaning of this
expression as an indexed product where a ranges over all types. Although this explanation is a bit
too simple as it stands, it does help with the basic intuition. If one wishes to make an abstraction
over the subtypes of a given type, one may use the concept of a bounded quantification [CW85].
Consider, for example, the term


e' = Λa ≤ {l: s}. λx: a. x.l

where {l: s} is a record expression which has one field, labelled l, with type s. The expression e'
denotes an operator which takes a subtype t of {l: s} (we write t ≤ {l: s}) and returns as value a
function from t to s. (The reader should not confuse a, a type variable, with t, a type expression.)
Intuitively, a subtype of {l: s} is a record which has an l field whose type is a subtype of s. The
type of e' is indicated by the expression u' = ∀a ≤ {l: s}. a → s. How should we think of this type
semantically?  Taking  an  analogy  with  the  intuitive  semantics  of  polymorphic  quantification,  we 
want  to  think  of  the  meaning  of  u'  as  some  kind  of  indexed  product.  But  indexed  over  what?  In 
this  paper  we  argue  that  one  may  get  an  intuitive  semantics  of  bounded  quantification  by  thinking 
of  a  type  expression  such  as  u'  as  a  family  of  types  indexed  over  coercions  (i.e.  certain  functions) 
from  a  type  a  into  the  type  s. 

To support this intuition we must explain the meaning of the application e'(t) of the expression
e' to a type expression t which is a subtype of {l: s}. The key fact is this: given type expressions
v and w and a proof that v is a subtype of w, there is a canonical coercion from v into w. Hence,
the application e'(t) has, as its meaning, the element of t → s obtained by applying the meaning
of e' (which is an element of an indexed product) to the canonical coercion from t to {l: s}. This
leads us to consider u' as the type


∀a. (a ∘→ {l: s}) → a → s

where a ∘→ {l: s} is a “type of coercions”. In category-theoretic jargon: the meaning of a bounded
quantification with bound v will be an adjoint to a fibration over the slice category over v. This
follows the analogy with models of polymorphism which are based on adjoints to fibrations over
the category of all domains (as in [CGW89] for example).
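
The reading of a bounded generic as a function that takes a coercion as an extra argument can be sketched as follows. This is a minimal illustration, assuming records are modelled as Python dictionaries and ignoring the type argument itself; the names e_prime and forget_m are hypothetical.

```python
def e_prime(coerce):
    """Sketch of the translated bounded generic e'.

    The bounded type argument is replaced by a coercion from the
    argument type into the bound {l: s}; the body selects field l
    after applying that coercion.
    """
    return lambda x: coerce(x)["l"]


def forget_m(rec):
    """Canonical coercion from {l: int, m: str} into the bound {l: int}."""
    return {"l": rec["l"]}


# Applying e' at the subtype {l: int, m: str} means passing its coercion.
select_l = e_prime(forget_m)
print(select_l({"l": 7, "m": "extra"}))  # 7
```

Instantiating at the bound itself amounts to passing the identity coercion, for example e_prime(lambda rec: rec).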

Although we believe that the translation just illustrated is intuitive, we need to show that it
is  coherent.  In  other  words,  we  must  show  that  the  semantic  function  is  well  defined.  The  need 
for  coherence  comes  from  the  fact  that  a  typing  judgement  may  have  many  different  derivations. 
In  general,  it  is  customary  to  present  the  semantics  of  typed  lambda  calculi  as  a  map  defined 
inductively  on  type-checking  derivations.  Such  a  method  would  therefore  assign  a  meaning  to 
each  derivation  tree.  We  do  believe  though,  that  the  language  consists  of  the  derivable  typing 
judgements,  rather  than  of  the  derivation  trees.  For  many  calculi,  such  as  the  simply  typed  or  the 
polymorphic  lambda  calculus,  there  is  at  most  one  derivation  for  any  typing  judgement.  Therefore, 
in  such  calculi,  giving  meaning  to  derivations  is  the  same  as  giving  meaning  to  derivable  judgements. 
But for other calculi, such as Martin-Löf’s Intuitionistic Type Theory (ITT) [Mar84] (see [Sal88]),
and the Calculus of Constructions [CH88] (see [Str88]), and, of immediate concern to us, Cardelli
and Wegner’s Fun, this is not so, and one must prove that derivations yielding the same judgement
are given the same meaning. This idea has also appeared in the context of category theory and our
use of the term “coherence” is partially inspired by its use there, where it means the uniqueness
of  certain  canonical  morphisms  (see  e.g.  [KL71]  and  [LP85]).  Although  we  have  not  attempted 
a  rigorous  connection  in  this  paper,  the  possibility  of  unifying  coherence  results  for  a  variety  of 
different  calculi  offers  an  interesting  direction  of  investigation.  In  the  case  of  Fun,  we  show  the 
coherence  of  our  semantic  approach  by  proving  that  translations  of  any  two  derivations  of  the  same 
typing  judgement  are  equated  in  the  target  calculus. 

Hence,  the  coherence  of  a  given  translation  is  a  property  of  the  equational  theory  of  the  target 
calculus.  When  the  target  calculus  is  the  polymorphic  lambda  calculus  extended  with  records  and 
recursive  types,  the  standard  axiomatization  of  its  equational  theory  is  sufficient  for  the  coherence 
theorem.  But  when  we  add  variants,  the  standard  axiomatization  of  these  features,  while  sufficient 
for  coherence,  clashes  with  the  standard  axiomatization  of  recursive  types,  yielding  an  inconsistent 
theory  (see  [Law69,  HP89a]  for  variants,  that  is,  coproducts).  The  solution  lies  in  two  observations: 
(1)  the  (too)  strong  axioms  are  only  needed  for  “coercion  terms”,  and  (2)  in  the  various  models  we 
examined these coercion terms have special interpretations (such as strict, or linear maps), so special
in  fact,  that  they  satisfy  the  corresponding  restrictions  of  the  strong  axioms!  Correspondingly,  one 
has  to  restrict  the  domains  over  which  “coercion  variables”  can  range,  which  leads  naturally  to  the 
type  of  coercions  mentioned  above. 

3  Translation  for  a  fragment  of  the  calculus 

For  pedagogical  reasons,  we  begin  by  considering  a  language  whose  type  structure  features  function 
spaces (exponentials), record types, bounded generic types (an inheritance-generalized form of
universal polymorphism), recursive types, and, of course, inheritance. In the next section we will
enrich  this  calculus  by  the  addition  of  variants.  As  we  have  mentioned  before,  this  leads  to  some 
(interesting)  complications  which  we  avoid  by  restricting  ourselves  to  the  simpler  calculus  of  this 
section.  Since  the  calculus  in  the  next  section  is  stronger,  we  omit  details  for  the  proofs  of  results 
in  this  section.  They  resemble  the  proofs  for  the  calculus  with  variants,  but  the  calculations  are 
simpler.  Rather  than  generate  four  different  names  for  the  calculi  which  we  shall  consider  in 
this  section  and  the  next  we  simply  refer  to  the  calculus  with  inheritance  as  SOURCE  and  the 
inheritance-free  calculus  into  which  it  is  translated  as  TARGET.  The  fragment  of  the  calculus 
which  we  consider  in  this  section  is  fully  described  in  the  appendices  to  the  paper. 

We provide semantics to SOURCE via a translation into a language for which several
well-understood semantics already exist. This “target” language, which we shall call TARGET, is an
extension with record and recursive types of the Girard-Reynolds polymorphic lambda calculus
(see [CGW87] for the semantics of TARGET). Therefore, SOURCE extends TARGET with inheritance
and bounded generics, and TARGET is in turn an extension of what Girard calls System
F in [Gir86]. Our translation takes derivations of inheritance and typing judgements in SOURCE
into derivations of typing judgements in TARGET. We translate the inheritance judgements of
SOURCE into definable terms of TARGET which can be thought of as canonical explicit
coercions. Bounded generics translate into usual generics, but of “higher” type, which take an additional
argument which can be thought of as an arbitrary coercion.

In  arguing  that  this  translation  yields  a  semantics  for  SOURCE,  we  encounter,  as  mentioned 
in the introduction, an important complication: as we shall see, in SOURCE as well as in Fun,
there may be several distinct derivations of the same typing judgement (or inheritance judgement,
for that matter). We consider, however, the language to consist of the derivable typing judgements,
rather than of the derivation trees. This distinction can be ignored in System F or TARGET,
where there is at most one derivation for any typing judgement, so giving meaning to derivations
is  the  same  as  giving  meaning  to  derivable  judgements.  But  for  SOURCE  and  Fun,  this  is  not  so, 
and  one  must  show  that  derivations  yielding  the  same  judgement  are  given  the  same  meaning.  This 
meaning  is  then  defined  to  be  the  meaning  of  the  judgement.  This  crucial  problem  was  overlooked 
by  publications  on  the  semantics  of  inheritance  prior  to  [BCGS89]. 

We  solve  the  problem  as  follows.  It  turns  out  that  our  translation  takes  syntactically  distinct 
derivations  of  the  same  SOURCE  judgement  into  syntactically  distinct  derivations  in  TARGET. 
But  we  give  an  equational  axiomatization  as  an  integral  part  of  TARGET,  and  we  show  that 
our  translation  takes  derivations  of  the  same  SOURCE  judgement  into  derivations  of  provably 
equal  judgements  in  TARGET.  By  this  coherence  result,  any  model  of  TARGET,  being  also  a 
model of its equational theory, will provide a well-defined semantics for the derivable judgements
of  SOURCE. 

The source calculus. For notation, we will follow the spirit of Fun [CW85] making precise only the
differences. The type expressions include type variables a and a distinguished constant Top. If s
and t are type expressions, then s → t is the type of functions from s to t. If s₁, ..., sₙ are type
expressions, and l₁, ..., lₙ is a collection of distinct labels, then {l₁: s₁, ..., lₙ: sₙ} is a record type
expression. We make the syntactic assumption that the order of the labels is irrelevant. If s and t
are type expressions then ∀a ≤ s. t is a bounded quantification which binds free occurrences of the
variable a in the type expression t (but not in s). Similarly, μa. t is a recursive type expression in
which the type variable a is bound in the type expression t. Intuitively, μa. t is the solution of the
equation a = t. We will use [s/a]t for substitution. The raw terms of the language include (term)
variables x, applications d(e) and lambda abstractions λx: t. e. An expression {l₁ = e₁, ..., lₙ = eₙ}
is called a record with fields l₁, ..., lₙ and the expression e.l is the selection of the field l. Again,
we assume that the order of the fields of a record is irrelevant, but the labels must all be distinct.
We also have bounded type abstraction Λa ≤ t. e and the corresponding application e(t). To form
terms of recursive type μa. t we have intro expressions intro[μa. t]e and they are eliminated from the
recursion by elim expressions elim e. See Appendix A to find a grammar for the type expressions
and raw terms of the fragment.

Raw terms are type-checked by deriving typing judgements of the form Γ ⊢ e : t, where Γ is a
context. Contexts are defined recursively as follows: ∅ is a context; if Γ is a context which does not
declare a, and the free variables of t are declared in Γ, then Γ, a ≤ t is a context; if Γ is a context
which does not declare x, and the free variables of t are declared in Γ, then Γ, x: t is a context.
The proof system for deriving typing judgements is the relevant fragment of the corresponding
proof system for Fun (see [CW85] on pages 519-520) enriched with two type-checking rules for the
introduction and elimination of recursive types [CGW87]. A complete list of these proof rules is in
Appendix A under the heading Fragment.

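Among these proof rules, the following two illustrate the effect of inheritance on type-checking:

$$\text{[INH]}\qquad \frac{\Gamma \vdash e : s \qquad \hat{\Gamma} \vdash s \le t}{\Gamma \vdash e : t}$$

$$\text{[B-SPEC]}\qquad \frac{\Gamma \vdash e : \forall a \le s.\, t \qquad \hat{\Gamma} \vdash r \le s}{\Gamma \vdash e(r) : [r/a]t}$$
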


They make use of inheritance judgements which have the form C ⊢ s ≤ t where C is an
inheritance context. Inheritance contexts are contexts in which only declarations of the form a ≤ t
appear. If Γ is a context, we denote by Γ̂ the inheritance context obtained from Γ by erasing the
declarations of the form x: t. The proof system for deriving inheritance judgements is, with the
exception of one rule, the same as the relevant fragment of the corresponding proof system for Fun
(see [CW85], on page 519). In this paper we do not attempt to enrich it with any rule deriving
inheritances between recursive types. A discussion of this issue appears in our conclusions. The
Appendix contains a complete list of these proof rules too.

In comparison with Fun, we would like to strengthen the rule deriving inheritances between
bounded generics, and we are able to do so for some of our results. Where Fun had just

$$\text{(W-FORALL)}\qquad \frac{C,\ a \le t \vdash u \le v}{C \vdash \forall a \le t.\, u \le \forall a \le t.\, v}$$

we will consider

$$\text{(FORALL)}\qquad \frac{C \vdash s \le t \qquad C,\ a \le s \vdash u \le v}{C \vdash \forall a \le t.\, u \le \forall a \le s.\, v}$$

This makes the system strictly stronger, allowing more inheritances to be derived, and thus more
terms to type-check.

Originally,  we  believed  that  coherence  could  be  proved  for  a  system  that  includes  variants 
and  the  stronger  rule  (FORALL)  [BCGS89].  In  dealing  with  the  case  construct  for  variant  types, 
however, our coherence proof uses an order-theoretic property (see Lemma 11) which fails for the
stronger system for deriving inheritances that uses (FORALL) (for a counterexample, see Giorgio
Ghelli’s dissertation [Ghe90]). Thus, we prove the coherence of the translation of variants
(Theorem 13) only for the weaker system with (W-FORALL). Note, however, that we prove coherence
in  the  presence  of  (FORALL)  for  the  system  without  variants  (Theorem  4)  and  for  the  system  for 
deriving  inheritances  between  types,  including  variant  types  (Lemma  9). 

Remark. Decidability of type-checking in the stronger system is a non-trivial question. The
question whether an algorithm of Luca Cardelli will decide the provability of judgements in this
calculus has only recently been settled by Ghelli [Ghe90].

The salient feature of bringing inheritance into a type system is that (in given contexts) terms
will no longer have a unique type. For example, due to the rule

$$\text{(TOP)}\qquad C \vdash t \le \mathit{Top}$$

where the free variables of $t$ are declared in $C$, by [INH], all terms that type-check with some
type will also type-check with type $\mathit{Top}$. This makes it possible to define ordinary
generics as syntactic sugar: $\forall a.\, t \stackrel{\mathrm{def}}{=} \forall a \le \mathit{Top}.\, t$.

The proof system for SOURCE, while quite intuitive, allows for the following complication:
there may be more than one derivation of the same typing judgement. In fact, we only need
record types, (RECD), [VAR], [SEL] and [INH] (see Appendix) to provide such an example: in the
context $x : \{l_1 : \mathit{Top}, l_2 : \mathit{Top}\}$, we can either directly derive by [SEL]
$x.l_1 : \mathit{Top}$, or we can derive by [VAR] $x : \{l_1 : \mathit{Top}, l_2 : \mathit{Top}\}$,
then by (RECD) and [INH] $x : \{l_1 : \mathit{Top}\}$, and finally by [SEL] $x.l_1 : \mathit{Top}$.
In view of this, for any semantics given by “induction on the rules”, one needs to prove
that derivations of the same judgement have the same meaning.

The target calculus. As mentioned before, TARGET is the Girard-Reynolds polymorphic lambda
calculus, enriched with record and recursive types [CGW87, BC88, CGW89]. Here, we present it
as a simplification of SOURCE. Types are given by

$$a \;\mid\; s \to t \;\mid\; \forall a.\, t \;\mid\; \{l_1 : s_1, \ldots, l_n : s_n\} \;\mid\; \mu a.\, t$$

and terms by

$$x \;\mid\; M(N) \;\mid\; \lambda x : t.\, M \;\mid\; \Lambda a.\, M \;\mid\; M(t) \;\mid\; \{l_1 = M_1, \ldots, l_n = M_n\} \;\mid\; M.l \;\mid\; \mathsf{intro}[\mu a.\, t]M \;\mid\; \mathsf{elim}\ M$$

For $n = 0$ we get the empty record type $1 \stackrel{\mathrm{def}}{=} \{\}$ and the empty record,
for which we will keep the notation $\{\}$. Typing contexts are the obvious simplification of
contexts in which only typing judgements occur (there is no inheritance relation in TARGET). The
rules for deriving typing judgements in the fragment of TARGET discussed in this section can be
found in Appendix B.
The  following  is  a  well-known  fact: 

Proposition  1  In  TARGET,  derivations  of  typing  judgements  are  unique. 

Proof: All the “elimination” rules, [APPL], [SEL], [SPEC], and [R-ELIM], are “cut” rules, in the
sense that there is information in the premisses that does not appear in the conclusion.
Consequently, they should in principle cause problems for the uniqueness of derivations. However,
the lost information is always in the type part, and types “should” be unique. This suggests the
strengthening of the induction hypothesis, which then passes trivially through these “cut” rules.

One proves therefore that for any two derivations $\Delta_1$ and $\Delta_2$, if $\Delta_1$ ends in
$\Gamma \vdash M : t_1$ and $\Delta_2$ ends in $\Gamma \vdash M : t_2$ then $\Delta_1 = \Delta_2$
(in particular, $t_1 = t_2$).

The proof can be done straightforwardly, either by induction on the maximum of the heights
of $\Delta_1$ and $\Delta_2$, or on the sum of those heights, or even on the structure of $M$ (with
a bit of reformulation). ∎

A technical point: it turns out that type decorations are unnecessary on “elimination”
constructs, but they are in fact necessary on some “introduction” constructs, such as lambda
abstraction and the recursive type construct $\mathsf{intro}[\,]$. Later on, with the addition of
variants in Section 4, we will find that we need to differ from [CW85], and decorate with types
the constructs that “inject” into variant types (see Appendix B).

Equations are derived by a proof system (see [CGW87, BC88, CGW89]) which contains rules like
reflexivity, symmetry, transitivity, congruence with respect to function application, closure under
functional abstraction ($\xi$), congruence with respect to application to types, and closure with
respect to type abstraction (type $\xi$). There are also the {BETA} and {ETA} rules for both
functional and type abstraction, rules saying that $\mathsf{intro}[\,]$ and $\mathsf{elim}$ are
inverse to each other, as well as

$$\text{\{RECD-BETA\}}\qquad \{l_1 = M_1, \ldots, l_n = M_n\}.l_i = M_i$$

where $n \ge 1$, and

$$\text{\{RECD-ETA\}}\qquad \{l_1 = M.l_1, \ldots, l_n = M.l_n\} = M$$

where $M : \{l_1 : s_1, \ldots, l_n : s_n\}$. The last rule gives, for $n = 0$, the equation
$\{\} = M$, which makes $1$ into a terminator. Under our interpretation, the type $\mathit{Top}$
will be nothing like a “universal domain” which can be used to interpret Type:Type [CGW89, GJ90].
On the contrary, it will be interpreted as a one point domain in the models we list below!

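The translation. For any SOURCE item we will denote by item* its translation into TARGET.
We begin with the types. Note the translation of bounded generics and of $\mathit{Top}$.

$$\begin{aligned}
a^* &\stackrel{\mathrm{def}}{=} a \\
\{l_1 : s_1, \ldots, l_n : s_n\}^* &\stackrel{\mathrm{def}}{=} \{l_1 : s_1^*, \ldots, l_n : s_n^*\} \\
\mathit{Top}^* &\stackrel{\mathrm{def}}{=} 1 \\
(\forall a \le s.\, t)^* &\stackrel{\mathrm{def}}{=} \forall a.\, (a \to s^*) \to t^* \\
(s \to t)^* &\stackrel{\mathrm{def}}{=} s^* \to t^* \\
(\mu a.\, t)^* &\stackrel{\mathrm{def}}{=} \mu a.\, t^*
\end{aligned}$$

One shows immediately that $([s/a]t)^* = [s^*/a]t^*$. We extend this to contexts and inheritance
contexts, which translate into just typing contexts in TARGET.

$$\begin{aligned}
\emptyset^* &\stackrel{\mathrm{def}}{=} \emptyset & \emptyset^* &\stackrel{\mathrm{def}}{=} \emptyset \\
(\Gamma,\, a \le t)^* &\stackrel{\mathrm{def}}{=} \Gamma^*,\, a,\, f : a \to t^* & (C,\, a \le t)^* &\stackrel{\mathrm{def}}{=} C^*,\, a,\, f : a \to t^* \\
(\Gamma,\, x : t)^* &\stackrel{\mathrm{def}}{=} \Gamma^*,\, x : t^*
\end{aligned}$$

where $f$ is a fresh variable for each $a$.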

Next we will describe how we translate the derivations of judgements of SOURCE. The translation
is defined by recursion on the structure of the derivation trees. Since these are freely generated
by the derivation rules, it is sufficient to provide for each derivation rule of SOURCE a
corresponding rule on trees of TARGET judgements. It will be a lemma (Lemma 2 to be precise) that
these corresponding rules are directly derivable in TARGET; therefore the translation takes
derivations in SOURCE into derivations in TARGET.

A SOURCE derivation yielding an inheritance judgement $C \vdash s \le t$ is translated as a tree
of TARGET judgements yielding $C^* \vdash P : s^* \to t^*$. We present three of the rules here; the
full list for the fragment appears in Appendix C. The coercion into $\mathit{Top}$ is simply the
constant map:

$$\text{(TOP)}^*\qquad C^* \vdash \lambda x : t^*.\, \{\} : t^* \to 1$$


To see how coercion works on types, assume that we are given a coercion $P : s \to t$ from $s$
into $t$ and a coercion $Q : u \to v$ from $u$ into $v$. Then it is possible to coerce a function
$f : t \to u$ into a function from $s$ to $v$ as follows. Given an argument of type $s$, coerce it
(using $P$) into an argument of type $t$. Apply the function $f$ to get a value of type $u$. Now
coerce this value in $u$ into a value in $v$ by applying $Q$. This describes a function of the
desired type. More formally, we translate the (ARROW) rule by

$$\text{(ARROW)}^*\qquad \frac{C^* \vdash P : s^* \to t^* \qquad C^* \vdash Q : u^* \to v^*}{C^* \vdash R : (t^* \to u^*) \to (s^* \to v^*)}$$

where $R \stackrel{\mathrm{def}}{=} \lambda z : t^* \to u^*.\; P; z; Q$. (We use $;$ as shorthand
for composition. For example, $P; z; Q$ above stands for $\lambda x : s^*.\, Q(z(P(x)))$ where $x$
is fresh.) Now, to translate the rule (FORALL), which describes the inheritance relation for
bounded quantification, we view the quantification as ranging over a type together with a coercion
from that type into the bound:

$$\text{(FORALL)}^*\qquad \frac{C^* \vdash P : s^* \to t^* \qquad C^*,\, a,\, f : a \to s^* \vdash Q : u^* \to v^*}{C^* \vdash R : (\forall a.\, (a \to t^*) \to u^*) \to (\forall a.\, (a \to s^*) \to v^*)}$$

where $R \stackrel{\mathrm{def}}{=} \lambda z : (\forall a.\, (a \to t^*) \to u^*).\; \Lambda a.\; \lambda f : a \to s^*.\; Q(z(a)(f; P))$.
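The three translated rules above can be sketched as ordinary higher-order functions. This is only an illustration under stated assumptions: Python is untyped, so type abstraction and type application are erased, records are modelled as dictionaries, and the names `compose`, `forget_y`, `double_x`, `get_x` and `point` are hypothetical sample data that do not appear in the paper.

```python
def compose(*steps):
    """Left-to-right composition: compose(P, z, Q)(x) == Q(z(P(x)))."""
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

def top(_x):
    """(TOP)*: the constant map into the empty record."""
    return {}

def arrow(P, Q):
    """(ARROW)*: R = lambda z. P; z; Q."""
    return lambda z: compose(P, z, Q)

def forall(P, Q_of):
    """(FORALL)*: R = lambda z. Lambda a. lambda f. Q(z(a)(f; P)).

    Type abstraction and application are erased. Q_of(f) stands for the
    coercion Q with the coercion variable f in scope.
    """
    return lambda z: lambda f: Q_of(f)(z(compose(f, P)))

# Hypothetical sample data (assumptions, not taken from the paper).
identity = lambda x: x
forget_y = lambda r: {"x": r["x"]}   # assumed coercion from {x, y} into {x}
double_x = lambda r: 2 * r["x"]      # a function defined on {x} records
point = {"x": 3, "y": 4}

# A function on {x} records, coerced so that it accepts {x, y} records.
widened = arrow(forget_y, identity)(double_x)

# A translated bounded generic: it receives the coercion into its bound.
get_x = lambda f: lambda v: f(v)["x"]
# The same generic, coerced to the tighter bound {x, y}.
get_x_xy = forall(forget_y, lambda f: identity)(get_x)
```

In this sketch `widened(point)` first forgets the `y` field and then runs `double_x`, while `get_x_xy` composes whatever coercion it is handed with `forget_y` before passing it on, which mirrors the term $f; P$ in the definition of $R$.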

Now, a SOURCE derivation yielding a typing judgement $\Gamma \vdash e : t$ is translated as a tree
of TARGET judgements yielding $\Gamma^* \vdash M : t^*$. For example, the inheritance rule is
translated by simply making the inheritance coercion “explicit”:

$$\text{[INH]}^*\qquad \frac{\Gamma^* \vdash M : s^* \qquad \hat{\Gamma}^* \vdash P : s^* \to t^*}{\Gamma^* \vdash P(M) : t^*}$$

The specialization of a bounded quantification is more subtle. The variable is instantiated by
substituting the type expression to which the abstraction is applied, but then the coercion from
the argument type to the bound type must be passed as an argument to the resulting function:

$$\text{[B-SPEC]}^*\qquad \frac{\Gamma^* \vdash M : \forall a.\, (a \to s^*) \to t^* \qquad \hat{\Gamma}^* \vdash P : r^* \to s^*}{\Gamma^* \vdash M(r^*)(P) : [r^*/a]t^*}$$
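As a rough illustration of the two translated typing rules, the sketch below passes the coercion explicitly, in the way [B-SPEC]* prescribes. It is a minimal sketch under assumptions: types are erased, records are dictionaries, and `forget_y`, `abs_x`, `inh` and `b_spec` are hypothetical names chosen for this example.

```python
def forget_y(r):
    """Assumed coercion from {x, y} records into {x} records."""
    return {"x": r["x"]}

def abs_x(f):
    """Translation of a generic bounded by {x}: the bound shows up as an
    explicit coercion argument f from the instance type into {x}."""
    return lambda v: abs(f(v)["x"])

def inh(P, M):
    """[INH]*: make the inheritance coercion explicit by applying it."""
    return P(M)

def b_spec(M, P):
    """[B-SPEC]*: M(r*)(P); the type argument r* is erased here, so only
    the coercion P from the instance type into the bound is passed."""
    return M(P)

point = {"x": -3, "y": 4}            # hypothetical sample record
abs_x_on_xy = b_spec(abs_x, forget_y)
```

Here `abs_x_on_xy(point)` runs the generic at the instance type of `{x, y}` records, and `inh(forget_y, point)` is the explicit form of viewing `point` at the supertype.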


The remaining rules for translating the fragment are given in Appendix C. It is possible to check
that the translated rules are derivable in the target language:

Lemma 2 The rules (TOP)* - (TRANS)* and [VAR]* - [INH]* are directly derivable in TARGET. ∎


Coherence of the translation. For any derivation $\Delta$ in SOURCE, let $\Delta^*$ be the TARGET
derivation into which it is translated. The central result about inheritance judgements says that,
given a judgement $s \le t$ and a pair of proofs $\Delta_1$ and $\Delta_2$ of this judgement, the
coercions induced by these two proofs are provably equal in the equational theory of TARGET. More
formally, we have the following:

Lemma 3 (Coherence of the translation of inheritance) Let $\Delta_1$ and $\Delta_2$ be two SOURCE
derivations of the same inheritance judgement, $C \vdash s \le t$. Let $\Delta_1^*$, $\Delta_2^*$
yield (coercion) terms $P_1$, $P_2$. Then, $P_1 = P_2$ is provable in TARGET.

The central result about typing judgements says that, given a judgement $e : t$ and a pair of
proofs $\Delta_1$ and $\Delta_2$ of this judgement, the translations of these proofs end in
sequents (translations of $e : t$) which are provably equal in the equational theory of TARGET,
i.e. we have:

Theorem 4 (Coherence) Let $\Delta_1$ and $\Delta_2$ be two SOURCE derivations yielding the same
typing judgement, $\Gamma \vdash e : t$. Let $\Delta_1^*$, $\Delta_2^*$ yield terms $M_1$, $M_2$.
Then, $M_1 = M_2$ is provable in TARGET.

The proofs of the lemma and theorem are almost as difficult as the ones we shall give for the
corresponding results in the full language. Since the proofs of these results for the fragment
follow similar lines to the proofs for the full language, we omit the proofs of Lemma 3 and
Theorem 4 in favor of the proofs of Lemma 9 and Theorem 13 below.
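To make the coherence statement concrete, the sketch below evaluates the translations of the two derivations of $x.l_1 : \mathit{Top}$ from the record example given earlier in this section. It assumes that the (RECD) coercion from $\{l_1, l_2\}$ records to $\{l_1\}$ records is the map that rebuilds the record from the retained field; the actual rule lives in Appendix C, so `recd_forget_l2` is a hypothetical stand-in, and records are modelled as dictionaries.

```python
def recd_forget_l2(r):
    """Assumed translation of the (RECD) instance {l1, l2} <= {l1}."""
    return {"l1": r["l1"]}

def derivation_1(x):
    """[SEL] applied directly to x."""
    return x["l1"]

def derivation_2(x):
    """[VAR], then [INH] with the record coercion made explicit, then [SEL]."""
    return recd_forget_l2(x)["l1"]

# Hypothetical sample value; {} plays the role of the unique element of 1.
sample = {"l1": {}, "l2": {}}
```

Both functions return the same field of `sample`, which is the semantic shadow of the fact that the two translated terms are provably equal, here by a single {RECD-BETA} step.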

4 Between incoherence and inconsistency: adding variants

The  calculus  described  so  far  does  not  deal  with  a  crucial  type  constructor:  variants.  In  particular, 
it  is  very  useful  to  have  a  combination  of  variant  types  with  recursive  types.  On  the  other  hand,  the 
combination  of  these  operators  in  the  same  calculus  is  also  problematic,  especially  for  the  equational 
theory.  The  situation  is  familiar  from  both  domain  theory  and  proof  theory.  In  this  section  we 
propose  an  approach  which  will  suffice  to  prove  the  coherence  theorem  which  we  need  to  show  that 
our  semantic  function  is  well-defined. 

We extend the type formation rules of SOURCE by adding variant type expressions
$[l_1 : t_1, \ldots, l_n : t_n]$ where $n \ge 1$. We also extend the term formation rules by the
formation of variant terms $[l_1 : t_1, \ldots, l_i = e, \ldots, l_n : t_n]$ and the case statement:

$$\mathsf{case}\ e\ \mathsf{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n$$

The inheritance judgement derivation rules are extended correspondingly with the rule:

$$\text{(VART)}\qquad \frac{C \vdash s_1 \le t_1 \quad \cdots \quad C \vdash s_p \le t_p}{C \vdash [l_1 : s_1, \ldots, l_p : s_p] \;\le\; [l_1 : t_1, \ldots, l_p : t_p, \ldots, l_q : t_q]}$$

Note the “duality” between this rule and the inheritance rule (RECD) for records (see Appendix
A). While a record subtype has more fields, a variant subtype has fewer variations (summands).

As before, we intend to translate this calculus into a calculus without inheritance and,
naturally, we extend TARGET with variants (see Appendix B). Note how the syntax of variant
injections differs from [CW85]. This is in order for the resulting system to enjoy the property of
having unique type derivations: the proof of Proposition 1 extends immediately to the variant
constructs. Most importantly, we must extend the equational theory of TARGET in a manner that
ensures the coherence of our translation. It is here that we encounter an interesting problem which
readers who know domain theory will find familiar. The following two axioms hold in a variety of
models:

$$\text{\{VART-BETA\}}\qquad \mathsf{case}\ \mathsf{inj}_{l_i}(M_i)\ \mathsf{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n \;=\; F_i(M_i)$$

where $F_1 : t_1 \to t, \ldots, F_n : t_n \to t$, $M_i : t_i$ and $\mathsf{inj}_{l_i}$ is shorthand for
$\lambda x : t_i.\, [l_1 : t_1, \ldots, l_i = x, \ldots, l_n : t_n]$.

$$\text{\{VART-ETA\}}\qquad \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \mathsf{inj}_{l_1}, \ldots, l_n \Rightarrow \mathsf{inj}_{l_n} \;=\; M$$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$. Unfortunately, these two axioms do not suffice to prove
all the identifications required by the coherence of our translation!

To see the problem, we start with an example. In SOURCE, suppose that $t \le s$ is derivable
in the context $\hat{\Gamma}$, and that we have a derivation $\Delta$ of
$\Gamma \vdash e : [l_1 : t_1, l_2 : t_2]$ and derivations $\Delta_i$ of
$\Gamma \vdash f_i : t_i \to t$, $i = 1, 2$. Consider then the following two SOURCE derivations of
the typing judgement $\Gamma \vdash \mathsf{case}\ e\ \mathsf{of}\ l_1 \Rightarrow f_1, l_2 \Rightarrow f_2 : s$.

1. By $\Delta$, $\Delta_1$, $\Delta_2$ and the rule [CASE], one deduces
$\Gamma \vdash \mathsf{case}\ e\ \mathsf{of}\ l_1 \Rightarrow f_1, l_2 \Rightarrow f_2 : t$. Since
$\hat{\Gamma} \vdash t \le s$ by hypothesis, one infers by inheritance
$\Gamma \vdash \mathsf{case}\ e\ \mathsf{of}\ l_1 \Rightarrow f_1, l_2 \Rightarrow f_2 : s$.

2. From $\hat{\Gamma} \vdash t \le s$ we can deduce $\hat{\Gamma} \vdash (t_i \to t) \le (t_i \to s)$.
Hence, by inheritance from $\Delta_i$, one deduces $\Gamma \vdash f_i : t_i \to s$. Then, from
$\Delta$ and by the rule [CASE], one deduces
$\Gamma \vdash \mathsf{case}\ e\ \mathsf{of}\ l_1 \Rightarrow f_1, l_2 \Rightarrow f_2 : s$.

The coherence property requires that these two derivations have provably equal translations. With
the obvious translation for the variant type constructor and the rules [VART] and [CASE] (see
Appendix C) and with the translation of the rules [INH], (ARROW) and (REFL) as in Section 3,
this comes down to the following identity

$$P(\mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow F_1, l_2 \Rightarrow F_2) \;=\; \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow (F_1; P),\, l_2 \Rightarrow (F_2; P)$$

where $P : t^* \to s^*$ is a “coercion term”, $M : [l_1 : t_1^*, l_2 : t_2^*]$ and
$F_i : t_i^* \to t^*$, $i = 1, 2$. Thus, we are tempted to postulate

$$\text{\{VART-CRN?\}}\qquad P(\mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n) \;=\; \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow F_1; P, \ldots, l_n \Rightarrow F_n; P$$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$, $F_1 : t_1 \to t, \ldots, F_n : t_n \to t$, $P : t \to s$.
This equation follows from the equation that axiomatizes variants analogously to coproducts:

$$\text{\{VART-COP?\}}\qquad Q(M) \;=\; \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \mathsf{inj}_{l_1}; Q, \ldots, l_n \Rightarrow \mathsf{inj}_{l_n}; Q$$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$, $Q : [l_1 : t_1, \ldots, l_n : t_n] \to t$. More
precisely, it is possible to check that the system {VART-BETA}+{VART-COP} is equivalent to
{VART-BETA}+{VART-CRN}+{VART-ETA}. However, it is known [Law69, HP89a] that
{VART-BETA}+{VART-COP} is inconsistent with the existence of fixed-points. In fact, this may be
refined:

Proposition  5  The  system  {VART-BETA}+{VART-CRN}  is  (equationally)  inconsistent  with  the 
existence  of  fixed-points. 

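Proof: The “categorical” equation {VART-COP} may be thought of as an “induction” principle
on a sum: it reduces the proof of an equation $P(M) = Q(M)$, $M : [l_1 : t_1, l_2 : t_2]$, to the
proofs of $P(\mathsf{inj}_{l_1}(x)) = Q(\mathsf{inj}_{l_1}(x))$, for $x : t_1$, and
$P(\mathsf{inj}_{l_2}(x)) = Q(\mathsf{inj}_{l_2}(x))$, for $x : t_2$. Indeed, we have
$P(M) = \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \lambda x.\, P(\mathsf{inj}_{l_1}(x)),\, l_2 \Rightarrow \lambda x.\, P(\mathsf{inj}_{l_2}(x))$
and
$Q(M) = \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \lambda x.\, Q(\mathsf{inj}_{l_1}(x)),\, l_2 \Rightarrow \lambda x.\, Q(\mathsf{inj}_{l_2}(x))$.
Given a type $t$, it is possible to define a “negation-like” operation on $[l_1 : t, l_2 : t]$ by
$\mathsf{neg}(M) = \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \lambda x.\, \mathsf{inj}_{l_2}(x),\, l_2 \Rightarrow \lambda x.\, \mathsf{inj}_{l_1}(x)$.
Given $x, y : t$, it is easy enough to define an operation $f(M, N) : t$, for
$M, N : [l_1 : t, l_2 : t]$, in such a way that
$f(\mathsf{inj}_{l_1}(u), \mathsf{inj}_{l_1}(v)) = f(\mathsf{inj}_{l_2}(u), \mathsf{inj}_{l_2}(v)) = x$
and
$f(\mathsf{inj}_{l_1}(u), \mathsf{inj}_{l_2}(v)) = f(\mathsf{inj}_{l_2}(u), \mathsf{inj}_{l_1}(v)) = y$.
We deduce then from the “induction principle” that $f(M, M) = x$ and
$f(M, \mathsf{neg}(M)) = y$, identically for $M : [l_1 : t, l_2 : t]$, hence the (equational)
inconsistency when we have a fixed-point combinator.

The fact that we can use, instead of {VART-COP?}+{VART-BETA}, the weaker system
{VART-BETA}+{VART-CRN?} comes simply from the fact that we can “relativise” this reasoning
to the elements of $[l_1 : t, l_2 : t]$ of the form
$\mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow \mathsf{inj}_{l_1},\, l_2 \Rightarrow \mathsf{inj}_{l_2}$,
i.e., to elements that satisfy the equation {VART-ETA}. ∎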
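The last step of the argument can be spelled out as a short calculation. It assumes a fixed-point combinator, written $Y$ here (the names $Y$ and $M_0$ are introduced only for this illustration), and uses the two identities $f(M, M) = x$ and $f(M, \mathsf{neg}(M)) = y$ obtained in the proof:

$$\begin{aligned}
M_0 &\stackrel{\mathrm{def}}{=} Y(\mathsf{neg}), \qquad\text{so that}\qquad M_0 = \mathsf{neg}(M_0), \\
x &= f(M_0, M_0) = f(M_0, \mathsf{neg}(M_0)) = y.
\end{aligned}$$

Since $x$ and $y$ were arbitrary elements of an arbitrary type $t$, any two terms of the same type become provably equal, which is what equational inconsistency means here.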

Thus, a naive approach gives us an unattractive choice between incoherence and inconsistency!
We are saved from this by the observation that, at least in the example above, we do not seem to
need the “full” usage of {VART-CRN} but only those instances in which $P$ is a term coming out
of a translation of an inheritance judgement, i.e., a “coercion term”. Such terms are much simpler
than general terms. In particular, we note that in models based on continuous maps, such terms
denote strict maps, and in models based on stable maps, they denote linear maps. Appropriate
constructions for interpreting variants can be given in both cases, such that {VART-CRN} is sound,
as long as $P$ ranges only over strict (or linear) maps.

Maintaining the same philosophy to our approach as in Section 3, we will try to abstractly
embody in TARGET a sufficient amount of formalism to ensure the provable coherence of our
translation. Thus, the previous discussion of variants leads us to introduce a new type constructor
$s \circ\!\!\to t$, the type of “coercions” from $s$ to $t$. Consequently, the coercion assumptions
$a \le t$ that occur in inheritance contexts must translate to variables ranging over types of
coercions $f : a \circ\!\!\to t^*$. As a consequence, the translation of bounded quantification
must change:

$$(\forall a \le s.\, t)^* \stackrel{\mathrm{def}}{=} \forall a.\, ((a \circ\!\!\to s^*) \to t^*)$$

In order to express the correct versions of {VART-CRN}, we introduce a family of constants in
TARGET

$$\iota_{s,t} : (s \circ\!\!\to t) \to (s \to t)$$

called coercion-coercion combinators. With this, we have

$$\text{\{VART-CRN\}}\qquad \iota(P)(\mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n) \;=\; \mathsf{case}\ M\ \mathsf{of}\ l_1 \Rightarrow F_1; \iota(P), \ldots, l_n \Rightarrow F_n; \iota(P)$$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$, $F_1 : t_1 \to t, \ldots, F_n : t_n \to t$,
$P : t \circ\!\!\to s$.

(the complete list is in Appendix B).

In order to translate all inheritance judgements into coercion terms, we add a special set of
constants (coercion combinators) that “compute” the translations of the rules for deriving
inheritance judgements. To prove coherence, we axiomatize the behavior of the $\iota$-images of
these combinators.

For example, the coercion combinator for the rule (ARROW) takes a pair of coercions as arguments
and yields a new coercion as value:

$$\mathsf{arrow}[s, t, u, v] : (s \circ\!\!\to t) \to (u \circ\!\!\to v) \to ((t \to u) \circ\!\!\to (s \to v))$$

Since (ARROW) is a rule scheme, we naturally have a family of such combinators, indexed by
types. To simplify the notation, these types will be omitted whenever possible. The equational
property of the arrow combinator is given in terms of the coercion coercer:

$$\iota(\mathsf{arrow}(P)(Q)) \;=\; \lambda z : t \to u.\; (\iota(P)); z; (\iota(Q))$$

where $P : s \circ\!\!\to t$, $Q : u \circ\!\!\to v$. For the rule (TRANS), we introduce

$$\mathsf{trans}[r, s, t] : (r \circ\!\!\to s) \to (s \circ\!\!\to t) \to (r \circ\!\!\to t)$$

which, of course, behaves like composition, modulo the coercion coercer:

$$\iota(\mathsf{trans}(P)(Q)) \;=\; \iota(P); \iota(Q)$$

where $P : r \circ\!\!\to s$, $Q : s \circ\!\!\to t$. The combinator for the rule (FORALL) is the
most involved:

$$\mathsf{forall}[s, t, a, u, v] : (s \circ\!\!\to t) \to \forall a.\, ((a \circ\!\!\to s) \to (u \circ\!\!\to v)) \to (\forall a.\, ((a \circ\!\!\to t) \to u) \circ\!\!\to \forall a.\, ((a \circ\!\!\to s) \to v))$$

with the equational axiomatization

$$\iota(\mathsf{forall}(P)(W)) \;=\; \lambda z : (\forall a.\, (a \circ\!\!\to t) \to u).\; \Lambda a.\; \lambda f : a \circ\!\!\to s.\; \iota(W(a)(f))(z(a)(\mathsf{trans}(f)(P)))$$

where $P : s \circ\!\!\to t$, $W : \forall a.\, (a \circ\!\!\to s) \to (u \circ\!\!\to v)$. Of
course, we have gone to the extra inconvenience of introducing the type of coercions in order to
provide a satisfactory account of variants. These require a scheme of combinators having the types:

$$\mathsf{vart}[s_1, \ldots, s_p, t_1, \ldots, t_q] : (s_1 \circ\!\!\to t_1) \to \cdots \to (s_p \circ\!\!\to t_p) \to ([l_1 : s_1, \ldots, l_p : s_p] \circ\!\!\to [l_1 : t_1, \ldots, l_p : t_p, \ldots, l_q : t_q])$$

And it is now possible to assert a consistent equation for these combinators:

$$\iota(\mathsf{vart}(R_1) \cdots (R_p)) \;=\; \lambda w : [l_1 : s_1, \ldots, l_p : s_p].\; \mathsf{case}\ w\ \mathsf{of}\ l_1 \Rightarrow \iota(R_1); \mathsf{inj}_{l_1}, \ldots, l_p \Rightarrow \iota(R_p); \mathsf{inj}_{l_p}$$

where $R_1 : s_1 \circ\!\!\to t_1, \ldots, R_p : s_p \circ\!\!\to t_p$. In order to prove equalities
between terms of coercion type one uses the following rule:

$$\text{\{IOTA-INJ\}}\qquad \frac{\iota(P) = \iota(Q)}{P = Q}$$

which asserts that $\iota$ is an injection. In fact, all of the models we give below will interpret
$\iota$ as an inclusion. It is natural to ask whether the coercion coercer $\iota$ could have been
omitted from the calculus in favor of a rule:

$$\frac{P : s \circ\!\!\to t}{P : s \to t}$$

This would have the unfortunate consequence that a typing judgement $e : s$ would no longer
uniquely encode its proof, and the coherence question would therefore arise again! The other
combinators and their equational properties are described in Appendix B.
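One way to picture the separation between coercions and functions is to represent coercions as inert syntax and to let the coercion coercer be the only way to run them. The sketch below does this for a few of the combinators; it is an untyped illustration under assumptions, not the calculus itself: variants are modelled as `(label, payload)` pairs, the element of the one-point type is `{}`, and the class names and the function `iota` are hypothetical names chosen here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Refl:            # identity coercion
    pass

@dataclass(frozen=True)
class Top:             # coercion into Top, interpreted as a one-point type
    pass

@dataclass(frozen=True)
class Trans:           # trans(p)(q): first p, then q
    p: object
    q: object

@dataclass(frozen=True)
class Arrow:           # arrow(p)(q): coerce a function's argument and result
    p: object
    q: object

@dataclass(frozen=True)
class Vart:            # vart(R_1)...(R_p), stored as ((label, coercion), ...)
    branches: tuple

def iota(c):
    """Coercion coercer: turn coercion syntax into an ordinary function."""
    if isinstance(c, Refl):
        return lambda x: x
    if isinstance(c, Top):
        return lambda _x: {}
    if isinstance(c, Trans):
        p, q = iota(c.p), iota(c.q)
        return lambda x: q(p(x))                    # iota(p); iota(q)
    if isinstance(c, Arrow):
        p, q = iota(c.p), iota(c.q)
        return lambda z: (lambda x: q(z(p(x))))     # iota(p); z; iota(q)
    if isinstance(c, Vart):
        table = {label: iota(r) for label, r in c.branches}
        # case w of l_i => iota(R_i); inj_{l_i}
        return lambda w: (w[0], table[w[0]](w[1]))
    raise TypeError(f"not a coercion: {c!r}")
```

A coercion value such as `Arrow(Top(), Refl())` is not itself callable in this sketch; it has to go through `iota` before it can be applied, which mirrors the reason the paper gives for keeping $\iota$ explicit.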

We are now ready to explain how to translate our full language SOURCE (complete with
variants) into the language TARGET (with the coercion coercer and combinators). For starters,
the inheritance judgement for the function space is simply translated using the arrow combinator:

$$\text{(ARROW)}^*\qquad \frac{C^* \vdash P : s^* \circ\!\!\to t^* \qquad C^* \vdash Q : u^* \circ\!\!\to v^*}{C^* \vdash \mathsf{arrow}(P)(Q) : (t^* \to u^*) \circ\!\!\to (s^* \to v^*)}$$

The translation of an inheritance between quantified types takes the induced coercion and a
polymorphic function as its arguments:

$$\text{(FORALL)}^*\qquad \frac{C^* \vdash P : s^* \circ\!\!\to t^* \qquad C^*,\, a,\, f : a \circ\!\!\to s^* \vdash Q : u^* \circ\!\!\to v^*}{C^* \vdash \mathsf{forall}(P)(\Lambda a.\, \lambda f : a \circ\!\!\to s^*.\, Q) : \forall a.\, ((a \circ\!\!\to t^*) \to u^*) \circ\!\!\to \forall a.\, ((a \circ\!\!\to s^*) \to v^*)}$$

Other inheritance judgements are similarly translated. The real work is being done by equational
properties of the combinators.

The proofs of typing judgements are translated in a manner quite similar to how they were
translated in the fragment. For example,

$$\text{[B-SPEC]}^*\qquad \frac{\Gamma^* \vdash M : \forall a.\, ((a \circ\!\!\to s^*) \to t^*) \qquad \hat{\Gamma}^* \vdash P : r^* \circ\!\!\to s^*}{\Gamma^* \vdash M(r^*)(P) : [r^*/a]t^*}$$

is affected only by indicating that the map into the bound must be a coercion. The inheritance
rule is translated by

$$\text{[INH]}^*\qquad \frac{\Gamma^* \vdash M : s^* \qquad \hat{\Gamma}^* \vdash P : s^* \circ\!\!\to t^*}{\Gamma^* \vdash \iota(P)(M) : t^*}$$

since a coercion cannot be applied until it is made into a function by an application of the
coercion coercer. The full description of the translation of the full language is given in
Appendix C. We now turn to the proof of the central technical results of the paper.

5 Coherence of the translation for the full calculus

In  this  section  we  prove  first  the  coherence  of  the  translation  of  inheritance  judgements.  This  result 
is  then  used  to  show  the  coherence  of  the  translation  of  typing  judgements. 

The main cause for having distinct derivations of the same inheritance judgements is the rule
(TRANS). Our strategy is to show that the usage of (TRANS) can be coherently postponed to the
end of derivations (Lemma 6), and then to prove the coherence of the translation of
(TRANS)-postponed derivations (Lemma 8).

We introduce some convenient notations for the rest of this section. For any derivation $\Delta$
in SOURCE, let $\Delta^*$ be the TARGET derivation into which it is translated. We will write
$C \vdash r_0 \le \cdots \le r_n$ instead of $C \vdash r_0 \le r_1, \ldots, C \vdash r_{n-1} \le r_n$.
The composition of coercions given by $\mathsf{trans}$ occurs so often that we will write
$P \odot Q$ instead of $\mathsf{trans}(P)(Q)$. It is easy to see, making essential use of the rule
{IOTA-INJ}, that $\odot$ is provably associative. We will take advantage of this to unclutter the
notation. We will also write $I$ instead of $\mathsf{refl}$. Again it is easy to see that $I$ is
provably an identity for $\odot$, that is, $I \odot M = M \odot I = M$ is provable in TARGET.

Lemma 6 For any SOURCE derivation $\Delta$ yielding the inheritance judgement $C \vdash s \le t$,
there exist types $r_0, \ldots, r_n$ such that $s \equiv r_0$, $r_n \equiv t$, and (TRANS)-free
derivations $\Delta_1, \ldots, \Delta_n$ yielding respectively

$$C \vdash r_0 \le \cdots \le r_n$$

Moreover, if the translations $\Delta^*, \Delta_1^*, \ldots, \Delta_n^*$ yield respectively the
(coercion) terms $C^* \vdash P : s^* \circ\!\!\to t^*$,
$C^* \vdash P_1 : r_0^* \circ\!\!\to r_1^*$, ..., $C^* \vdash P_n : r_{n-1}^* \circ\!\!\to r_n^*$, then

$$C^* \vdash P = P_1 \odot \cdots \odot P_n$$

is provable in TARGET.

Proof: By induction on the height of the derivation $\Delta$. The base is trivial since derivations
consisting of instances of (TOP), (VAR), or (REFL) are already (TRANS)-free. We present the
more interesting cases of the induction step.

Suppose $\Delta$ ends with an application of (ARROW). By induction hypothesis there are
(TRANS)-free derivations for

$$s \equiv r_0 \le \cdots \le r_m \equiv t \qquad\text{and}\qquad u \equiv w_0 \le \cdots \le w_n \equiv v$$

(for simplicity, we omit the context). From these, using (REFL) and (ARROW) we get (TRANS)-free
derivations for

$$t \to u \equiv r_m \to u \le \cdots \le r_0 \to u \equiv s \to w_0 \le \cdots \le s \to w_n \equiv s \to v.$$

(This is not the most economical: one can get a derivation requiring only $\max(m, n)$, rather than
$m + n$, steps of (TRANS) at the end.) Proving the equality of the corresponding translations uses
the associativity of $\odot$ and the fact that $I$ acts like an identity, as well as

$$(1)\qquad \mathsf{arrow}(P)(Q) \odot \mathsf{arrow}(R)(S) \;=\; \mathsf{arrow}(R \odot P)(Q \odot S)$$

which can be verified, in view of {IOTA-INJ}, by applying $\iota$ to both sides, resulting in a
simple {BETA}-conversion.
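For the reader who wants the verification of (1) written out, here is one way to do it, using only the stated equations for $\iota$ on $\mathsf{trans}$ and $\mathsf{arrow}$ and the reading of $;$ as composition. The typings $P : s \circ\!\!\to t$, $Q : u \circ\!\!\to v$, $R : r \circ\!\!\to s$, $S : v \circ\!\!\to w$ are assumptions made here so that both sides are well formed (the letters $r$ and $w$ are chosen for this illustration):

$$\begin{aligned}
\iota(\mathsf{arrow}(P)(Q) \odot \mathsf{arrow}(R)(S))
 &= \iota(\mathsf{arrow}(P)(Q));\ \iota(\mathsf{arrow}(R)(S)) \\
 &= \lambda z.\ \iota(R);\ (\iota(P);\ z;\ \iota(Q));\ \iota(S) \\
 &= \lambda z.\ (\iota(R); \iota(P));\ z;\ (\iota(Q); \iota(S)) \\
 &= \lambda z.\ \iota(R \odot P);\ z;\ \iota(Q \odot S) \\
 &= \iota(\mathsf{arrow}(R \odot P)(Q \odot S)).
\end{aligned}$$

The rule {IOTA-INJ} then removes $\iota$ from both ends, giving (1).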
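
Suppose $\Delta$ ends with an application of (FORALL). By induction hypothesis there are
(TRANS)-free derivations for

$$C \vdash s \equiv r_0 \le \cdots \le r_m \equiv t \qquad\text{and}\qquad C,\, a \le s \vdash u \equiv w_0 \le \cdots \le w_n \equiv v$$

From these, using (REFL) and (FORALL) we get (TRANS)-free derivations for

$$C \vdash \forall a \le t.\, u \equiv \forall a \le r_m.\, u \le \cdots \le \forall a \le r_0.\, u \equiv \forall a \le s.\, u \equiv \forall a \le s.\, w_0 \le \cdots \le \forall a \le s.\, w_n \equiv \forall a \le s.\, v$$

Proving the equality of the corresponding translations uses

$$(2)\qquad \mathsf{forall}(P)(\Lambda a.\, \lambda f : a \circ\!\!\to s.\, Q) \odot \mathsf{forall}(R)(\Lambda a.\, \lambda g : a \circ\!\!\to r.\, S) \;=\; \mathsf{forall}(R \odot P)(\Lambda a.\, \lambda g : a \circ\!\!\to r.\, [g \odot R / f]Q \odot S)$$

where $P : s \circ\!\!\to t$ and $R : r \circ\!\!\to s$, and which can be verified by applying
$\iota$ to both sides.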

Suppose $\Delta$ ends with an application of (VART). By induction hypothesis there are
(TRANS)-free derivations for

$$s_1 \equiv r^1_0 \le \cdots \le r^1_{n_1} \equiv t_1, \qquad \ldots, \qquad s_p \equiv r^p_0 \le \cdots \le r^p_{n_p} \equiv t_p$$

(for simplicity, we omit the context). From these, using (REFL) and (VART) we get (TRANS)-free
derivations for

$$[l_1 : s_1, \ldots, l_p : s_p] \equiv [l_1 : r^1_0, \ldots, l_p : s_p] \le \cdots \le [l_1 : r^1_{n_1}, \ldots, l_p : s_p] \equiv \cdots \equiv [l_1 : t_1, \ldots, l_p : r^p_0] \le \cdots \le [l_1 : t_1, \ldots, l_p : r^p_{n_p}] \equiv [l_1 : t_1, \ldots, l_p : t_p] \le [l_1 : t_1, \ldots, l_p : t_p, \ldots, l_q : t_q].$$

Proving the equality of the corresponding translations uses

$$(3)\qquad \mathsf{vart}(P_1) \cdots (P_p) \odot \mathsf{vart}(Q_1) \cdots (Q_q) \;=\; \mathsf{vart}(P_1 \odot Q_1) \cdots (P_p \odot Q_p) \qquad (p \le q).$$

To verify this, let $L$ be the left hand side of the equation, $R$ the right hand side and let $w$
be a fresh variable. By extensionality (or {ETA} and {XI}) and by {IOTA-INJ}, it is sufficient to
show $\iota(L)(w) = \iota(R)(w)$. By {VART-COP}, this follows from

$$\mathsf{case}\ w\ \mathsf{of}\ l_1 \Rightarrow (\mathsf{inj}_{l_1}; \iota(L)), \ldots, l_p \Rightarrow (\mathsf{inj}_{l_p}; \iota(L)) \;=\; \mathsf{case}\ w\ \mathsf{of}\ l_1 \Rightarrow (\mathsf{inj}_{l_1}; \iota(R)), \ldots, l_p \Rightarrow (\mathsf{inj}_{l_p}; \iota(R))$$

which is readily verified.

When $\Delta$ ends with (TRANS), we just concatenate the chains of (TRANS)-free derivations and
the equality of the translations is an immediate consequence of the associativity of $\odot$. ∎
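The (ARROW) and (TRANS) cases of this proof can be read as a small recursive procedure on derivation trees. The sketch below implements that procedure for a toy fragment and checks on sample data that the chain it produces denotes the same function as the original derivation. It is an illustration under assumptions: types are erased, (VAR) leaves carry an arbitrary Python function standing for the coercion variable supplied by the context, and the names `postpone`, `trans_free`, `run_chain`, `inc`, `dbl` and `sample` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Refl:
    pass

@dataclass(frozen=True)
class Var:
    f: Callable        # stands for the coercion variable from the context

@dataclass(frozen=True)
class Trans:
    first: object
    second: object

@dataclass(frozen=True)
class Arrow:
    dom: object        # derivation of s <= t
    cod: object        # derivation of u <= v

def iota(d):
    """Meaning of a derivation as an ordinary function."""
    if isinstance(d, Refl):
        return lambda x: x
    if isinstance(d, Var):
        return d.f
    if isinstance(d, Trans):
        p, q = iota(d.first), iota(d.second)
        return lambda x: q(p(x))
    if isinstance(d, Arrow):
        p, q = iota(d.dom), iota(d.cod)
        return lambda z: (lambda x: q(z(p(x))))
    raise TypeError(d)

def postpone(d):
    """Return a list of (TRANS)-free derivations whose composite equals d."""
    if isinstance(d, Trans):
        return postpone(d.first) + postpone(d.second)
    if isinstance(d, Arrow):
        doms, cods = postpone(d.dom), postpone(d.cod)
        # Contravariance: domain steps come first, in reverse order,
        # then the codomain steps in their original order.
        return ([Arrow(p, Refl()) for p in reversed(doms)]
                + [Arrow(Refl(), q) for q in cods])
    return [d]

def trans_free(d):
    if isinstance(d, Trans):
        return False
    if isinstance(d, Arrow):
        return trans_free(d.dom) and trans_free(d.cod)
    return True

def run_chain(chain):
    """Compose the meanings of a chain from left to right."""
    fs = [iota(d) for d in chain]
    def run(x):
        for f in fs:
            x = f(x)
        return x
    return run

# Hypothetical sample derivation using (TRANS) under (ARROW).
inc = Var(lambda n: n + 1)
dbl = Var(lambda n: n * 2)
sample = Arrow(Trans(inc, dbl), Trans(dbl, inc))
chain = postpone(sample)
```

The sketch produces $m + n$ steps for an (ARROW) node, as in the proof, rather than the more economical $\max(m, n)$.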


The  following  is  used  to  handle  one  of  the  cases  in  Lemma  8  below. 

Lemma 7 For any two derivations, $\Delta$ yielding $C \vdash s \le t$ and $\Theta$ yielding
$C,\, a \le t \vdash u \le v$, there exists a derivation $\Sigma$ yielding
$C,\, a \le s \vdash u \le v$ such that
$\mathit{height}(\Sigma) = \max(\mathit{height}(\Delta), \mathit{height}(\Theta))$. Moreover, if
the translations $\Delta^*$, $\Theta^*$, $\Sigma^*$ yield respectively

$$C^* \vdash P : s^* \circ\!\!\to t^*, \qquad C^*,\, a,\, g : a \circ\!\!\to t^* \vdash Q : u^* \circ\!\!\to v^*, \qquad C^*,\, a,\, f : a \circ\!\!\to s^* \vdash R : u^* \circ\!\!\to v^*$$

then

$$C^*,\, a,\, f : a \circ\!\!\to s^* \vdash R = [f \odot P / g]Q$$

is provable in TARGET.

Proof: By induction on the height of $\Theta$. ∎


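Lemma 8 Let $\Delta_1, \ldots, \Delta_m$ be (TRANS)-free derivations in SOURCE yielding
respectively $C \vdash s_0 \le \cdots \le s_m$ and $\Theta_1, \ldots, \Theta_n$ be (TRANS)-free
derivations yielding respectively $C \vdash t_0 \le \cdots \le t_n$. Let the translations
$\Delta_1^*, \ldots, \Delta_m^*, \Theta_1^*, \ldots, \Theta_n^*$ yield respectively the (coercion)
terms

$$C^* \vdash P_1 : s_0^* \circ\!\!\to s_1^*,\ \ldots,\ C^* \vdash P_m : s_{m-1}^* \circ\!\!\to s_m^*, \qquad C^* \vdash Q_1 : t_0^* \circ\!\!\to t_1^*,\ \ldots,\ C^* \vdash Q_n : t_{n-1}^* \circ\!\!\to t_n^*$$

If $s_0 \equiv t_0$ and $s_m \equiv t_n$ then

$$C^* \vdash P_1 \odot \cdots \odot P_m = Q_1 \odot \cdots \odot Q_n$$

is provable in TARGET.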

Proof:  We  begin  with  the  following  remarks: 

• If one of $s_0, \ldots, s_m, t_0, \ldots, t_n$ is $\mathit{Top}$ then the desired equality holds.
Indeed, then $s_m \equiv \mathit{Top} \equiv t_n$ and the equality follows from the identity

$$(4)\qquad P \odot \mathsf{top} \;=\; \mathsf{top}$$

which is verified by applying $\iota$ to both sides (recall that $1$ is a terminator).

• Those derivations among $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$ which consist
entirely of one application of (REFL) can be eliminated without loss of generality. Indeed, the
corresponding coercion term is $I$, which acts as an identity for $\odot$.

• If none of the derivations among $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$
consists of just (TOP), then those derivations which consist of just (VAR) can also be eliminated
without loss of generality. Indeed, once we have eliminated the (REFL)'s, the (VAR)'s must form an
initial segment of both $\Delta_1, \ldots, \Delta_m$ and $\Theta_1, \ldots, \Theta_n$ because
whenever $s \le a$ is derivable, $s$ must also be a type variable. Let's say that
$s_0 \equiv a_0, \ldots, s_{p-1} \equiv a_{p-1}$ ($p \le m$), where $\Delta_1, \ldots, \Delta_p$
are all the derivations consisting of just (VAR), and also that
$t_0 \equiv b_0, \ldots, t_{q-1} \equiv b_{q-1}$ ($q \le n$), where $\Theta_1, \ldots, \Theta_q$
are all the derivations consisting of just (VAR). Then, $a_0 \le a_1, \ldots, a_{p-1} \le s_p$ as
well as $b_0 \le b_1, \ldots, b_{q-1} \le t_q$ must all occur in $C$. But
$a_0 \equiv s_0 \equiv t_0 \equiv b_0$, so by the uniqueness of declarations in contexts,
$a_1 \equiv b_1$, etc. Suppose $p < q$. Then $s_p \equiv b_p$ is a variable. Since $\Delta_{p+1}$
can't be just a (REFL) or a (TOP), it must be a (VAR), contradicting the maximality of $p$. Thus
$p = q$ and $s_p \equiv t_q$ and the (VAR)'s can be eliminated.

We proceed to prove the lemma by induction on the maximum of the heights of the derivations $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$. The basis of the induction is an immediate consequence of the remarks above.

For the induction step, in view of the remarks above, we can assume without loss of generality that none of the derivations is just a (TOP), (VAR), or (REFL). Consequently, $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$ must all end with the same rule, depending on the type construction used in $s_0 = t_0$.

If all derivations end in (ARROW), the desired equality follows from the induction hypothesis, the associativity of $\circledcirc$ and the equation (1). Similarly for (VART) using the equation (3). The desired equality in the case (FORALL) follows from the induction hypothesis using Lemma 7, from the associativity of $\circledcirc$ and from the equation (2). The remaining cases are straightforward. ∎

This gives us the coherence of the translation of inheritance judgements. To state it we need some terminology. We say that two SOURCE derivations which yield the same judgement are congruent if their translations in TARGET yield provably equal terms. We will write $\Delta_1 \cong \Delta_2$ for congruence of derivations. It is easy to check that $\cong$ is in fact a congruence with respect to the operations on derivations induced by the rules.

Lemma 9 (Coherence of the translation of inheritance) If $\Delta_1$ and $\Delta_2$ are two SOURCE derivations yielding the same inheritance judgement then $\Delta_1 \cong \Delta_2$ (their translations yield provably equal terms in TARGET).

Proof: Immediate consequence of Lemmas 6 and 8. ∎

Before we turn to the coherence of the translation of typing judgements, we will note a few facts about inheritance judgements that follow from Lemma 6 and that will be invoked subsequently. These facts are closely related to the remarks opening the proof of Lemma 8.

Remark 10 If $C \vdash s \le t$ is derivable, $s = a$, a type variable, and $t \ne a$ then

• if $t = b$, also a type variable, there must exist type variables $a_0, \ldots, a_n$, $n \ge 1$ such that $a = a_0$, $b = a_n$, and $a_{i-1} \le a_i \in C$, $i = 1, \ldots, n$;

• if $t$ is not a type variable, there must exist type variables $a_0, \ldots, a_n$, $n \ge 0$ and a type $u$ such that $a = a_0$, $a_{i-1} \le a_i \in C$, $i = 1, \ldots, n$, $a_n \le u \in C$, and $C \vdash u \le t$ (of course, this is trivial when $t = \mathrm{Top}$);

If $C \vdash s \le t$ is derivable, and $s$ is not a type variable, then $t$ cannot be a type variable, and if moreover $t \ne \mathrm{Top}$, then $s$ and $t$ must both have the “same” outermost type constructor (as detailed exhaustively below) and

• if $s = s_1 \to s_2$ and $t = t_1 \to t_2$ then $C \vdash t_1 \le s_1$ and $C \vdash s_2 \le t_2$;

• if $s = \{l_1 : s_1, \ldots, l_q : s_q\}$ and $t = \{l_1 : t_1, \ldots, l_p : t_p\}$ then $p \le q$ and $C \vdash s_1 \le t_1, \ldots, C \vdash s_p \le t_p$;

• if $s = \forall a \le s_1.\, s_2$ and $t = \forall a \le t_1.\, t_2$ then $C \vdash t_1 \le s_1$ and $C, a \le t_1 \vdash s_2 \le t_2$;

• if $s$ and $t$ are both recursive types then they must be identical;

• if $s = [l_1 : s_1, \ldots, l_p : s_p]$ and $t = [l_1 : t_1, \ldots, l_q : t_q]$ then $p \le q$ and $C \vdash s_1 \le t_1, \ldots, C \vdash s_p \le t_p$.
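The case analysis in Remark 10 reads almost directly as a decision procedure for a small fragment of the inheritance system. The following is a minimal sketch, not the paper's algorithm: it covers only Top, type variables, arrows and records, and the class names, the dictionary representation of the context `ctx` (variable name to declared bound) and the label-based record comparison are all assumptions made for illustration. It also assumes that the context has no cyclic declarations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (label, type) pairs


def subtype(ctx, s, t):
    """Decide C |- s <= t by following the cases of Remark 10 (illustrative fragment)."""
    if s == t or isinstance(t, Top):
        return True  # (REFL) and (TOP)
    if isinstance(s, Var):
        # Climb the chain of declared bounds a_0 <= a_1 <= ... found in the context.
        return s.name in ctx and subtype(ctx, ctx[s.name], t)
    if isinstance(t, Var):
        return False  # a non-variable is never below a type variable
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Contravariant in the domain, covariant in the codomain.
        return subtype(ctx, t.dom, s.dom) and subtype(ctx, s.cod, t.cod)
    if isinstance(s, Record) and isinstance(t, Record):
        have = dict(s.fields)
        # Every field demanded by t must be present in s at a smaller type.
        return all(l in have and subtype(ctx, have[l], u) for l, u in t.fields)
    return False
```

For instance, with `ctx = {"a": Var("b"), "b": Arrow(Top(), Top())}` the call `subtype(ctx, Var("a"), Arrow(Top(), Top()))` succeeds by following the declared bounds of `a` and then `b`, which is the chain described in the second bullet of the remark.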

We turn now to the coherence of the translation of typing judgements, which is the central technical result of the paper. As explained in section 3, we weaken the system by replacing the rule (FORALL) with (W-FORALL) (see Appendix A). With this, we have the following order-theoretic property about the inheritance judgements, which fails in the presence of (FORALL). The property asserts the existence of conditional greatest lower bounds and of least upper bounds.

Lemma 11 Replace (FORALL) with (W-FORALL). Let $C$ be an inheritance context and let $t_1, t_2$ be types.

1. If there is an $r$ with $C \vdash r \le t_i$, $(i = 1, 2)$, then there exists a type $t_1 \sqcap t_2$ such that

• $C \vdash t_1 \sqcap t_2 \le t_i$, $(i = 1, 2)$ and

• for any $s$ such that $C \vdash s \le t_i$, $(i = 1, 2)$ we have $C \vdash s \le t_1 \sqcap t_2$.

2. There is a type $t_1 \sqcup t_2$ such that

• $C \vdash t_i \le t_1 \sqcup t_2$, $(i = 1, 2)$ and

• for any $s$ such that $C \vdash t_i \le s$, $(i = 1, 2)$ we have $C \vdash t_1 \sqcup t_2 \le s$.

Proof: Because of the contravariance property of the first argument of the function space operator manifest in the rule (ARROW), we will prove items 1 and 2 simultaneously. In view of Lemma 6, it is sufficient to work with proofs where all instances of (TRANS) appear at the end. Since moreover any two types have a common upper bound, Top, the statement of the lemma is equivalent to the following formulation:

For any $\Delta_1, \ldots, \Delta_m$, (TRANS)-free derivations in SOURCE, yielding respectively $C \vdash u_0 \le \cdots \le u_m$ and any $\Theta_1, \ldots, \Theta_n$, (TRANS)-free derivations yielding respectively $C \vdash v_0 \le \cdots \le v_n$,

1. if $u_0 = v_0$, and we let $t_1 = u_m$ and $t_2 = v_n$, then there is a type $t_1 \sqcap t_2$ having the properties in item 1 of the lemma;

2. if $u_m = v_n$, and we let $t_1 = u_0$ and $t_2 = v_0$, then there is a type $t_1 \sqcup t_2$ having the properties in item 2 of the lemma.

This is shown by induction on the maximum of $m$, $n$ and of the heights of $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$. To be able to apply the induction hypothesis, a case analysis is performed, depending on the structure of $t_1$ and $t_2$. We will only look at a few illustrative cases. The facts listed in Remark 10 and the reasoning that produced these facts, as well as the remarks opening the proof of Lemma 8, are used throughout.

For example, if $t_1$ is a type variable in item 1, then $u_i$ is also a type variable for each $i$, and $u_{i-1} \le u_i \in C$, $i = 1, \ldots, m$. Then, one of $C \vdash u_0 \le \cdots \le u_m$ or $C \vdash v_0 \le \cdots \le v_n$ must be an initial segment of the other, so $t_1$ and $t_2$ are comparable and $t_1 \sqcap t_2$ can be taken as the smaller among them. For item 2, if $t_1$ is a type variable, then $u_0 \le u_1 \in C$ and, by induction hypothesis ($m$ decreases), $t_1 \sqcup t_2$ can be taken to be $u_1 \sqcup t_2$.

As another example, suppose that in item 1 $t_1$ has the form $\forall a \le s.\, r_1$. If $u_0 = v_0$ is a type variable, then $u_0 \le u_1 \in C$ and $v_0 \le v_1 \in C$, hence $u_1 = v_1$ and we can apply the induction hypothesis by eliminating $\Delta_1, \Theta_1$. Assume that $u_0 = v_0$ is not a type variable. By Remark 10 (simplified to take into account the weakening of (FORALL)), it must have the form $\forall a \le s.\, r$. Again by Remark 10, $t_2$ is either Top or has the form $\forall a \le s.\, r_2$. If $t_2 = \mathrm{Top}$ then $t_1 \sqcap t_2$ can be taken to be $t_1$. Otherwise, there are (TRANS)-free derivations $\Delta'_1, \ldots, \Delta'_m$ yielding respectively $C, a \le s \vdash u'_0 \le \cdots \le u'_m$ and $\Theta'_1, \ldots, \Theta'_n$ yielding respectively $C, a \le s \vdash v'_0 \le \cdots \le v'_n$, where $u'_0 = v'_0$ and $u'_m = r_1$ and $v'_n = r_2$, and where each of these derivations has strictly smaller height than the corresponding one among $\Delta_1, \ldots, \Delta_m, \Theta_1, \ldots, \Theta_n$. By induction hypothesis we get a type $r_1 \sqcap r_2$, and we can then take $t_1 \sqcap t_2$ to be $\forall a \le s.\, r_1 \sqcap r_2$. This calculation makes clear where our proof breaks down if we were to use the more general rule (FORALL) instead of (W-FORALL). Indeed, if the bounds on the type variables were allowed to differ, as in the more general case, we would be unable to apply the induction hypothesis since the two contexts would differ between the $\Theta$’s and the $\Delta$’s.
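The simultaneous treatment of items 1 and 2 can be sketched as a pair of mutually recursive functions. This is only an illustration under stated assumptions, not the proof's construction: it is restricted to Top, type variables and arrows (no quantifiers, records or variants), the names `meet`, `join` and `subtype` and the dictionary context are hypothetical, and contexts are assumed to be acyclic. The function `meet` returns `None` when no common lower bound exists, which mirrors the conditional nature of item 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


def subtype(ctx, s, t):
    """Decide C |- s <= t; ctx maps each variable name to its declared bound."""
    if s == t or isinstance(t, Top):
        return True
    if isinstance(s, Var):
        return s.name in ctx and subtype(ctx, ctx[s.name], t)
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        return subtype(ctx, t.dom, s.dom) and subtype(ctx, s.cod, t.cod)
    return False


def join(ctx, s, t):
    """Least upper bound; it always exists because Top bounds every type."""
    if subtype(ctx, s, t):
        return t
    if subtype(ctx, t, s):
        return s
    if isinstance(s, Var):
        # Replace a variable by its declared bound, as in the variable case of item 2.
        return join(ctx, ctx.get(s.name, Top()), t)
    if isinstance(t, Var):
        return join(ctx, s, ctx.get(t.name, Top()))
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Contravariance: the join of two arrows needs the meet of their domains.
        dom = meet(ctx, s.dom, t.dom)
        return Top() if dom is None else Arrow(dom, join(ctx, s.cod, t.cod))
    return Top()


def meet(ctx, s, t):
    """Greatest lower bound, or None when s and t have no common lower bound."""
    if subtype(ctx, s, t):
        return s
    if subtype(ctx, t, s):
        return t
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Dually, the meet of two arrows needs the join of their domains.
        cod = meet(ctx, s.cod, t.cod)
        return None if cod is None else Arrow(join(ctx, s.dom, t.dom), cod)
    return None
```

The mutual recursion between `join` and `meet` in the arrow case is the computational counterpart of proving the two items together.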

We omit the remaining cases, which use similar ideas. ∎

We will use this property in the proof of Lemma 12, which is a slightly stronger result than the actual coherence of the translation of typing judgements. Of course, the strengthening is exploited in a proof by induction. First we introduce a definition and more convenient notations. For derivations yielding typing judgements we define the essential height, which is computed as the usual height, with the proviso that [INH] and the rules yielding inheritance judgements do not increase it. We will also use a special notation for describing “composition” of derivations via the rules. We explain this notation through two examples. If $\Sigma$ yields $\Gamma \vdash e : s$ and $\Theta$ yields $\Gamma \vdash s \le t$, then [INH]($\Sigma$, $\Theta$) yields $\Gamma \vdash e : t$. If $\Delta$ yields $\Gamma, x : s \vdash e : t$ then [ABS]($\Delta$) yields $\Gamma \vdash \lambda x : s.\, e : s \to t$.

In preparation for the proof of the next lemma, we have two remarks.

• We have the following congruence

[INH]([INH]($\Sigma$, $\Theta_1$), $\Theta_2$) $\cong$ [INH]($\Sigma$, (TRANS)($\Theta_1$, $\Theta_2$)).

This follows from the fact that $\iota(Q)(\iota(P)(M)) = \iota(P \circledcirc Q)(M)$, which is immediately verified.

• Any SOURCE derivation is congruent to a derivation of the form [INH]($\Delta$, $\Theta$) where $\Delta$ does not end with an application of the [INH] rule. This follows from the previous remark and, in the case when the original derivation did not end in [INH], from

$\Delta \cong$ [INH]($\Delta$, (REFL))

which in turn follows from $M = \iota(I)(M)$.
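The two equations used in these remarks can be checked on a toy model. The sketch below is an assumption-laden illustration and not the TARGET calculus: a coercion is represented as a tuple of primitive steps, `compose` stands in for the composition of coercions (first `p`, then `q`), `iota` turns a coercion into an ordinary function, and `drop` is a hypothetical primitive coercion that forgets one record field.

```python
from functools import reduce

# The identity coercion: no primitive steps at all.
I = ()


def compose(p, q):
    """Diagrammatic composition of coercions: first p, then q."""
    return p + q


def iota(p):
    """Interpret a coercion as an ordinary function by running its steps in order."""
    return lambda m: reduce(lambda acc, step: step(acc), p, m)


def drop(label):
    """Hypothetical primitive coercion that forgets one field of a record."""
    return (lambda rec: {k: v for k, v in rec.items() if k != label},)


P = drop("z")
Q = drop("y")
M = {"x": 1, "y": 2, "z": 3}

# Two nested uses of [INH] agree with a single use of the composite coercion.
nested = iota(Q)(iota(P)(M))
composite = iota(compose(P, Q))(M)

# Coercing along (REFL) changes nothing.
unchanged = iota(I)(M)
```

In this model both `nested` and `composite` evaluate to the record with only the field `x`, and `unchanged` equals `M`, which is what the two remarks require of any interpretation.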

Lemma 12 Replace (FORALL) with (W-FORALL). For any two SOURCE derivations $\Delta_i$ yielding $\Gamma \vdash e : t_i$, $(i = 1, 2)$, there exists a type $s$, a derivation $\Sigma$ yielding $\Gamma \vdash e : s$ and two derivations $\Theta_i$ yielding $\Gamma \vdash s \le t_i$, $(i = 1, 2)$ such that

$\Delta_i \cong$ [INH]($\Sigma$, $\Theta_i$), $(i = 1, 2)$.

Proof: By induction on the maximum of the essential heights of $\Delta_1, \Delta_2$. In view of the previous remarks, it is sufficient to prove the statement of the lemma assuming that neither $\Delta_1$ nor $\Delta_2$ ends in [INH] (but we retain the actual statement of the lemma in the induction hypothesis). For such derivations, $\Delta_1$ and $\Delta_2$ must end with the same rule (which rule depends on the structure of $e$). We do a case analysis according to this last rule, and we include here only the cases which we believe are important for the understanding of the result (even if their treatment is straightforward) as well as some cases which are particularly complex. We will call the type $s$, whose existence is the essence of the result, the common type.

Rule [VAR]. It must be the case that $t_1 = t_2 = r$ where $x : r$ occurs in $\Gamma$. Consequently, the treatment of this rule is trivial: take the common type to be $r$, $\Sigma$ = [VAR], and $\Theta_1 = \Theta_2$ = (REFL).

The introduction rules are quite simple and we illustrate them with the rule [ABS]. Suppose that $\Delta_i$ = [ABS]($\Delta'_i$) and that $\Delta_i$ yields $\Gamma \vdash \lambda x : s.\, e : s \to t_i$ ($s$ is the same since it appears in the term), thus $\Delta'_i$ yields $\Gamma, x : s \vdash e : t_i$, $(i = 1, 2)$. Apply the induction hypothesis to $\Delta'_1, \Delta'_2$ obtaining $r, \Sigma', \Theta'_1, \Theta'_2$. Also by induction hypothesis,

$\Delta_i \cong$ [ABS]([INH]($\Sigma'$, $\Theta'_i$)), $(i = 1, 2)$.

We claim that the right hand side is congruent to

[INH]([ABS]($\Sigma'$), (ARROW)((REFL), $\Theta'_i$)).

This implies that the statement of the lemma holds for $\Delta_1, \Delta_2$, with common type $s \to r$, with $\Sigma$ = [ABS]($\Sigma'$), and with $\Theta_i$ = (ARROW)((REFL), $\Theta'_i$), $(i = 1, 2)$. The congruence claim follows from

$$\lambda x : s.\, \iota(P)(M) = \iota(\mathsf{arrow}(I)(P))(\lambda x : s.\, M)$$

which is readily verified.

Rule [B-SPEC]. To simplify the notation, we omit the contexts. Suppose that $\Delta_i$ = [B-SPEC]($\Delta'_i$, $\Pi_i$) and that $\Delta_i$ yields $e(r) : [r/a]t_i$ ($r$ is the same since it appears in the term and we can take the bound variable to be the same without loss of generality), thus $\Delta'_i$ yields $e : \forall a \le s_i.\, t_i$ and $\Pi_i$ yields $r \le s_i$, $(i = 1, 2)$. Apply the induction hypothesis to $\Delta'_1, \Delta'_2$ obtaining $w, \Sigma', \Theta'_1, \Theta'_2$. Also by induction hypothesis,

(5) $\Delta_i \cong$ [B-SPEC]([INH]($\Sigma'$, $\Theta'_i$), $\Pi_i$), $(i = 1, 2)$.

Since $w \le \forall a \le s_i.\, t_i$, $(i = 1, 2)$ are derivable, it follows from Remark 10 (simplified to take into account the weakening of (FORALL)) that there must exist types $u, v$ such that $s_i = u$, $a \le s_i \vdash v \le t_i$, $(i = 1, 2)$ and $w \le \forall a \le u.\, v$ are derivable. It follows that $r \le u$, and, by Lemma 7, that $a \le r \vdash v \le t_i$, $(i = 1, 2)$ are derivable. Next, we will use the following sublemma:

Sublemma For any derivation $\Delta$ yielding $C, a \le r \vdash s \le t$ there exists a derivation $\Xi$ yielding $C \vdash [r/a]s \le [r/a]t$ such that, if the translations $\Delta^*, \Xi^*$ yield respectively

$$C^*, a, f : a \circ\!\!\to r^* \vdash P : s^* \circ\!\!\to t^*, \qquad C^* \vdash Q : [r^*/a]s^* \circ\!\!\to [r^*/a]t^*$$

then

$$C^* \vdash Q = (\Lambda a.\, \lambda f : a \circ\!\!\to r^*.\, P)(r^*)(I)$$

is provable in TARGET. ∎

The sublemma is proved by induction on the height of $\Delta$ and is omitted. The sublemma allows us to obtain $[r/a]v \le [r/a]t_i$ from $a \le r \vdash v \le t_i$, $(i = 1, 2)$. Let $\Theta_i$ be some derivation of $[r/a]v \le [r/a]t_i$, $(i = 1, 2)$. Let $\Pi$ be some derivation of $r \le u$. Let $\Omega$ be some derivation of $w \le \forall a \le u.\, v$. One can readily verify that the right hand side of (5) is congruent to

[INH]([B-SPEC]([INH]($\Sigma'$, $\Omega$), $\Pi$), $\Theta_i$).

This implies that the statement of the lemma holds for $\Delta_1, \Delta_2$, with common type $[r/a]v$, with $\Sigma$ = [B-SPEC]([INH]($\Sigma'$, $\Omega$), $\Pi$), and with $\Theta_i$ being just $\Theta_i$, $(i = 1, 2)$. (Note. There is no difficulty in dealing with (FORALL) instead of (W-FORALL) here: $s_i = u$ would be simply replaced by $s_i \le u$.)

Rule [R-ELIM]. Suppose that $\Delta_i$ = [R-ELIM]($\Delta'_i$) and that $\Delta_i$ yields $\Gamma \vdash \mathtt{elim}\ e : [\mu a_i.\, t_i / a_i]t_i$, thus $\Delta'_i$ yields $\Gamma \vdash e : \mu a_i.\, t_i$, $(i = 1, 2)$. Apply the induction hypothesis to $\Delta'_1, \Delta'_2$ obtaining $s', \Sigma', \Theta'_1, \Theta'_2$. Also by induction hypothesis,

$\Delta_i \cong$ [R-ELIM]([INH]($\Sigma'$, $\Theta'_i$)), $(i = 1, 2)$.

Since $s' \le \mu a_i.\, t_i$, $(i = 1, 2)$ are derivable, it follows from Remark 10 that there must exist $a, t$ such that $\mu a_i.\, t_i = \mu a.\, t$, $(i = 1, 2)$ and $s' \le \mu a.\, t$ are derivable. Let $\Theta'$ be any derivation of $s' \le \mu a.\, t$. Since by Lemma 9, $\Theta'_1 \cong \Theta'_2 \cong \Theta'$, the statement of the lemma holds with common type $[\mu a.\, t / a]t$, with $\Sigma$ = [R-ELIM]([INH]($\Sigma'$, $\Theta'$)), and with $\Theta_i$ = (REFL), $(i = 1, 2)$.

Rule [CASE]. Again, to simplify the notation, we omit the contexts. Suppose that $\Delta_i$ = [CASE]($\Delta'_i$, $\Delta'_{1i}$, ..., $\Delta'_{ni}$) and that $\Delta_i$ yields $\mathtt{case}\ e\ \mathtt{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n : t_i$, thus $\Delta'_i$ yields $e : [l_1 : t_{1i}, \ldots, l_n : t_{ni}]$, and $\Delta'_{ji}$ yields $f_j : t_{ji} \to t_i$, $(j = 1, \ldots, n)$, $(i = 1, 2)$. Apply the induction hypothesis to $\Delta'_1, \Delta'_2$ obtaining $s, \Sigma', \Theta'_1, \Theta'_2$. Also apply the induction hypothesis to $\Delta'_{j1}, \Delta'_{j2}$ obtaining $s_j, \Sigma'_j, \Theta'_{j1}, \Theta'_{j2}$, $(j = 1, \ldots, n)$. By induction hypothesis,

(6) $\Delta_i \cong$ [CASE]([INH]($\Sigma'$, $\Theta'_i$), [INH]($\Sigma'_1$, $\Theta'_{1i}$), ..., [INH]($\Sigma'_n$, $\Theta'_{ni}$)), $(i = 1, 2)$.

Since $s \le [l_1 : t_{1i}, \ldots, l_n : t_{ni}]$, $(i = 1, 2)$ are derivable, it follows again from Remark 10 that there must exist $m \le n$ and types $r_1, \ldots, r_m$ such that $r_1 \le t_{1i}, \ldots, r_m \le t_{mi}$, $(i = 1, 2)$ and $s \le [l_1 : r_1, \ldots, l_m : r_m]$ are derivable. Again similarly, for each of $j = 1, \ldots, n$, since $s_j \le t_{ji} \to t_i$, $(i = 1, 2)$ are derivable, there must exist $u_j, v_j$ such that $t_{ji} \le u_j$ and $v_j \le t_i$, $(i = 1, 2)$ as well as $s_j \le u_j \to v_j$ are derivable. Thus, we can derive $r_j \le t_{ji} \le u_j$, $(j = 1, \ldots, m)$, $(i = 1, 2)$. However, the fact that the $v_j$’s may be distinct causes a problem when we want to apply [CASE]. This is resolved by Lemma 11. Since $n \ge 1$, there exists a common lower bound of $t_1$ and $t_2$ (say $v_1$), hence $v = t_1 \sqcap t_2$ exists and we can derive $v_j \le v \le t_i$, $(j = 1, \ldots, n)$, $(i = 1, 2)$. We conclude that there exists a derivation $\Theta''$ of $s \le [l_1 : u_1, \ldots, l_n : u_n]$, that there exist derivations $\Theta''_j$ of $s_j \le u_j \to v$, $(j = 1, \ldots, n)$ and that there exist derivations $\Theta_i$ of $v \le t_i$, $(i = 1, 2)$. With these, we claim that the right hand side of (6) is congruent to

[INH]([CASE]([INH]($\Sigma'$, $\Theta''$), [INH]($\Sigma'_1$, $\Theta''_1$), ..., [INH]($\Sigma'_n$, $\Theta''_n$)), $\Theta_i$).

This implies that the statement of the lemma holds for $\Delta_1, \Delta_2$, with common type $v$, with $\Sigma$ = [CASE]([INH]($\Sigma'$, $\Theta''$), [INH]($\Sigma'_1$, $\Theta''_1$), ..., [INH]($\Sigma'_n$, $\Theta''_n$)), and with $\Theta_i$ being just $\Theta_i$, $(i = 1, 2)$.

To prove the congruence claim we introduce notations for certain derivations of inheritance judgements whose existence we have established. For each $j = 1, \ldots, n$, $i = 1, 2$, let $\Xi_{ji}$ be some derivation for $t_{ji} \le u_j$. Then (ARROW)($\Xi_{ji}$, $\Theta_i$) is a derivation for $u_j \to v \le t_{ji} \to t_i$. By Lemma 9 we have

(7) $\Theta'_{ji} \cong$ (TRANS)($\Theta''_j$, (ARROW)($\Xi_{ji}$, $\Theta_i$)).

Let $\Xi$ be some derivation of $s \le [l_1 : r_1, \ldots, l_m : r_m]$. For each $j = 1, \ldots, m$, $i = 1, 2$, let $\Omega_{ji}$ be some derivation for $r_j \le t_{ji}$. By Lemma 9 we have

(8) $\Theta'_i \cong$ (TRANS)($\Xi$, (VART)($\Omega_{1i}$, ..., $\Omega_{mi}$))

and

(9) $\Theta'' \cong$ (TRANS)($\Xi$, (TRANS)((VART)($\Omega_{1i}$, ..., $\Omega_{mi}$), (VART)($\Xi_{1i}$, ..., $\Xi_{ni}$))).

With these, the congruence claim follows from

$$\mathtt{case}\ \iota(P \circledcirc \mathsf{vart}(Q_1)\cdots(Q_m))(M)\ \mathtt{of}\ l_1 \Rightarrow \iota(R_1 \circledcirc \mathsf{arrow}(S_1)(T))(F_1), \ldots, l_n \Rightarrow \iota(R_n \circledcirc \mathsf{arrow}(S_n)(T))(F_n)$$

$$= \iota(T)(\mathtt{case}\ \iota(P \circledcirc \mathsf{vart}(Q_1)\cdots(Q_m) \circledcirc \mathsf{vart}(S_1)\cdots(S_n))(M)\ \mathtt{of}\ l_1 \Rightarrow \iota(R_1)(F_1), \ldots, l_n \Rightarrow \iota(R_n)(F_n)).$$

By (3) and {VART-CRN} the right hand side equals

$$\mathtt{case}\ \iota(P \circledcirc \mathsf{vart}(Q_1 \circledcirc S_1)\cdots(Q_m \circledcirc S_m))(M)\ \mathtt{of}\ l_1 \Rightarrow \iota(R_1)(F_1); \iota(T), \ldots, l_n \Rightarrow \iota(R_n)(F_n); \iota(T)$$

and the equality is readily verified. ∎

Theorem 13 (Coherence) Replace (FORALL) with (W-FORALL). If $\Delta_1$ and $\Delta_2$ are two SOURCE derivations yielding the same typing judgement then $\Delta_1 \cong \Delta_2$ (their translations yield provably equal terms in TARGET).

Proof: Take $t_1 = t_2$ in Lemma 12. By Lemma 9, $\Theta_1 \cong \Theta_2$. It follows that $\Delta_1 \cong \Delta_2$. ∎

6  Models 

So far we have not actually given a model for the language SOURCE. In this section we correct this omission. However, it is a central point of this paper that there is basically nothing new that we need to do in this section, since calculi satisfying the equational theory of TARGET have been thoroughly studied in the literature on the semantics of type systems. Domain-theoretic semantics suggests natural candidates for a special class of maps with the properties needed to interpret the operators $\iota$ and $\circ\!\!\to$. Here we list some of these semantic solutions, all of which apply to abstract types as well as to variants. A syntactic version could also be given by a syntactic translation into an extension of the target calculus of section 2, which expresses the properties mentioned above and the consistency of which is ensured by our semantic considerations.

The domain-theoretic interpretations that we have examined so far are summarized in the following table. The necessary properties for all but the last row can be found in [TT87, HP89b], [CGW89], [ABL86], [CGW87], and [Gir87] respectively. The properties needed for the last row can be checked in a manner similar to [Gir87].


TYPES                 TERMS             COERCIONS           VARIANTS
Algebraic lattices    continuous maps   bistrict maps       sep sum of lattices
Scott domains                           strict maps         separated sums
Finitary projections
dI domains            stable maps       strict stable maps
coherent spaces                         linear maps         !A ⊕ !B
dI domains

By a bistrict map of lattices we mean a continuous map which preserves both bottom and top elements. A separated sum of lattices $L$ and $M$ is the disjoint sum of $L$ and $M$ together with new top and bottom elements. Note that the category of Scott domains (finitary projections, respectively) and strict maps does have finite coproducts, given by coalesced sums of domains, and this implies that the required equation

{VART-CRN?} $P(\mathtt{case}\ M\ \mathtt{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n) = \mathtt{case}\ M\ \mathtt{of}\ l_1 \Rightarrow F_1; P, \ldots, l_n \Rightarrow F_n; P$

holds if $P$ is a strict map (in fact, a separated sum of domains $A$ and $B$ is just the coalesced sum of the lifted domains $A_\perp$ and $B_\perp$). Furthermore, it may be checked that strictness is preserved by the formation of coercion maps from given ones according to the coercion rules given in section 3 and at the beginning of this section. This model also satisfies {VART-BETA} + {VART-ETA}. An important property used in the case of Scott domains (finitary projections, respectively) is that the continuous maps from $C$ to $D$ are in one-to-one correspondence with the strict maps from $C_\perp$ to $D$. Analogous remarks hold for stable maps and linear maps, with $!C$ instead of $C_\perp$ (see [Gir89], Chapter 8).
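The role of strictness in the commutation equation can be seen on a small executable model. This is a sketch under assumptions, not one of the domain-theoretic models of the table: `BOTTOM` (Python's `None`) stands in for the undefined element, a variant is a `(label, value)` pair, `case` is strict in its scrutinee, and `then(f, p)` plays the part of the composite written `F; P` in the equation. All of these names are hypothetical.

```python
BOTTOM = None  # stands in for the undefined (least) element


def case(m, branches):
    """Case analysis on a variant; strict in the scrutinee m."""
    if m is BOTTOM:
        return BOTTOM
    label, value = m
    return branches[label](value)


def then(f, p):
    """The composite 'F; P': run the branch f, then apply p to its result."""
    return lambda x: p(f(x))


def commutes(p, m, branches):
    """Check P(case M of l => F) == case M of l => F; P on one input."""
    lhs = p(case(m, branches))
    rhs = case(m, {label: then(f, p) for label, f in branches.items()})
    return lhs == rhs


def strict_p(x):
    """A strict map: it sends BOTTOM to BOTTOM."""
    return BOTTOM if x is BOTTOM else x + 1


def constant_p(x):
    """A non-strict map: it ignores its argument entirely."""
    return 0


branches = {"inl": lambda n: n * 2, "inr": lambda s: len(s)}
```

With these definitions `commutes(strict_p, m, branches)` holds for defined and undefined scrutinees alike, whereas `commutes(constant_p, BOTTOM, branches)` fails: the left side is `0` and the right side is `BOTTOM`. This is the situation the strictness requirement on coercions rules out.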

From a category-theoretic point of view, the main point is that we are dealing with two categories, one a reflective subcategory of the other, i.e. the inclusion functor has a left adjoint. The subcategory contains all objects of the larger category. While the larger category is cartesian closed, the reflective subcategory (in which our coercions live) does have coproducts.

From a proof-theoretic point of view, it is interesting to note that our solution is similar to the treatment of proof-theoretic commutation rules for disjunction (see [Tro73], 4.1.3, on page 279 for a presentation of commutation rules). The so-called commutation rules for sums in proof theory are closely related to the equations {VART-CRN?} where $P$ is an “evaluation” map (see Appendix B of [Gir88]).

7  Conclusions  and  directions  for  further  investigation 

The  development  of  calculi  for  the  representation  of  inheritance  polymorphism  and  the  semantics 
of  such  calculi  is  a  growing  and  dynamic  area  of  research  investigation  in  programming  languages. 
We  expect  that  the  calculi  considered  in  this  paper  are  only  a  small  sample  of  what  is  yet  to 
be  developed.  In  this  section  we  will  speculate  on  a  few  of  the  most  important  directions  for 
further  development  which  will  play  a  significant  role  in  future  work  of  the  authors  of  this  paper 
in  particular  and  the  research  community  in  general. 

Partial  Equivalence  Relations.  Much  of  the  research  on  the  semantics  of  the  system  which  we 
have  considered  has  been  based  on  the  use  of  PER’s  as  described  by  Bruce  and  Longo  [BL88].  It  is 
therefore  worthwhile  to  compare  the  approach  in  this  paper  to  this  alternative  approach.  There  is  an 
evident  means  of  carrying  out  a  technical  comparison:  since  the  PER  model  interprets  the  calculus 
TARGET,  it  also  interprets  SOURCE  via  our  translation.  But  the  semantics  in  [BL88]  gives  the 
interpretation  (without  recursion)  directly  using  PER’s.  Could  these  two  interpretations  be  the 
same?  For  a  certain  fragment  of  SOURCE  (including  recursion  but  not  bounded  quantification), 
Cardone  has  recently  answered  the  question  in  the  affirmative  for  his  form  of  semantics  [Car89b] 
(where  coherence  is  not  an  issue  because  the  interpretation  of  a  judgement  e:  s  is  given  as  the 
equivalence  class,  in  s,  of  the  interpretation  of  the  erasure  of  e — hence  the  meaning  is  not  defined 
inductively on a derivation). For the full calculus the answer is still unknown as this paper is being
written.  Amadio’s  thesis  contains  some  results  about  the  relationship  between  explicit  coercions 
and  PER  inclusion  [Ama91]. 

Equational  Theory.  The  reader  has  probably  noted  that  we  have  never  offered  an  equational  theory 
for  SOURCE,  only  one  for  TARGET.  At  the  current  time,  the  proper  equational  theory  for 
SOURCE is still a subject of active research. However, our translation does suggest an equational
theory. One can prove that two terms of SOURCE are equal by showing that their translations
are equivalent in the equational theory for TARGET. Any of the models we have proposed will
satisfy  the  resulting  equational  theory.  (Whether  this  is  also  true  of  the  interpretation  of  [BL88] 
may  follow  if  this  interpretation  is  the  same  as  ours.)  Since  our  translation  is  computable,  it  follows 
that  this  reflected  equational  theory  for  SOURCE  is  recursively  enumerable;  it  is  natural  to  ask 
for  a  reasonable  axiomatization  of  this  theory.  Note,  for  example,  if  e  =  e':s  holds  in  SOURCE 
and  s  <  t,  then  e  =  e':t  also  holds  in  the  reflected  theory.  There  are  probably  many  similarly 
interesting  derived  equational  rules. 

Recursion. Any attempt to provide a model for a calculus which combines inheritance and recursion must deal with the seemingly contradictory semantic characteristics of inheritance and recursion at higher types. Ordinarily, the rule for inheritance between exponentials (function spaces) is given as follows:

$$\frac{u \le s \qquad t \le v}{s \to t \le u \to v}$$

where $s, t, u, v$ are type expressions and $\le$ is the relation of inheritance (reading $s \le t$ as “$s$ inherits from $t$”). Note, in particular, the contravariance in the first argument of the $\to$ operator. In contrast, semantic domains which solve recursive domain equations such as $D \cong D \to D$ are generally constructed using a technique (adjoint pairs, to be precise) which makes it possible to “order” types using a concept of approximation based on the rule

where $\phi = (\phi^L, \phi^R)$ and $\psi = (\psi^L, \psi^R)$ are adjoint pairs and $\phi \to \psi$ is the adjoint pair $(\lambda f.\, \psi^L \circ f \circ \phi^R,\ \lambda f.\, \psi^R \circ f \circ \phi^L)$. Note, for this case, the covariance in the first argument of the $\to$ operator. Because of this difference, models such as the PER interpretation of Bruce and
Longo  [BL88],  which  provides  a  semantics  for  inheritance  and  parametric  polymorphism,  do  not 
evidently  extend  to  a  semantics  for  recursive  types.  To  provide  for  recursive  types  under  this 
interpretation, M. Coppo and M. Zacchi [Cop85, CZ86] utilize an appeal to the structure of the underlying universal domain, which is itself an inverse limit which solves a recursive equation. R. Amadio [Ama89, Ama90] and F. Cardone [Car89b] have explored this approach in considerable detail. There has also been progress on understanding the solution of recursive equations over domains internally to the PER model which should provide further insights [FMRS89, Fre89]. On the other hand, models such as those of Girard [Gir86] and Coquand, Gunter and Winskel [CGW87, CGW89], which handle parametric polymorphism and recursive types, do not provide an evident interpretation for inheritance. It has been the purpose of this paper to resolve this problem by an appeal to the paradigm of “inheritance as implicit coercion”. However, this leaves open the question of
how  recursive  types  can  be  treated  with  this  technique  if  one  is  to  include  a  more  powerful  set  of 
rules  for  deriving  inheritance  judgements  between  recursive  types. 

One  complicating  problem  is  to  decide  exactly  what  form  of  inheritance  between  recursive  types 
is  desired.  For  example,  it  seems  very  reasonable  that  if  s  is  a  subtype  of  t  then  the  type  of  lists 
of  s’s  should  be  a  subtype  of  lists  of  t’s.  This  is  not  actually  derivable  in  the  inheritance  system 
described  in  this  paper  since  there  are  no  rules  for  inheritance  between  recursive  types.  But  care 
must be taken: if $s$ is a subtype of $t$, should the solution of the equation $a = a \to s$ be a subtype of the solution of $a = a \to t$? There are several possible approaches to answering this question. The PER interpretation provides a good guide: we can ask whether the solutions of these two equations have the desired relation in the PER model. Concerning the coercions approach, we are forced to ask whether there is any intuitive coercion between these two types. If there is, we have not seen it! It is reasonable to conjecture that inheritance relations derived using the following rule will be acceptable:

$$\frac{C, a \le \mathrm{Top} \vdash s \le t}{C \vdash \mu a.\, s \le \mu a.\, t}$$

where types $s$ and $t$ have only positive occurrences of the variable $a$. Unfortunately, this misses many interesting inheritance relations that one would like to settle. Discussions of this problem will appear in several future publications on this subject. A rather satisfactory treatment using coercions has been described in [BGS89] by using the “Amber rule” of Cardelli [Car86].

Operational semantics. Despite its importance there is virtually no literature on theoretical issues
concerning the operational semantics of languages with inheritance polymorphism. In particular,
at the time we are writing there are no published discussions of the relationship (if any!) of the
denotational models which have been studied to the intended operational semantics of a programming
language based on the models. In fact, the operational semantics of no existing “practical”
programming language is based on the kind of semantics discussed in this or any of the other papers
on the semantics of Fun. This is because there is a divergence between the “traditional” style of
semantics for the $\lambda$-calculus and the way the evaluation mechanisms of modern functional programming
languages actually work. In particular, no functional programming language in common use
evaluates past a lambda abstraction. Hence the identification of the constantly divergent function
with the divergent element will cause the denotational semantics to fail to be computationally
adequate with respect to the evaluation. Another related problem concerns the use of the $\beta$-rule and
call-by-value evaluation. Many of the functional programming languages now in use evaluate all
actual function parameters. This evaluation strategy immediately causes the full $\beta$-rule to fail. For
example, the application of a constant function to a divergent argument will diverge in general.
Semantically, this means that terms of higher type must be interpreted as strict functions. In a
subsequent paper [BGS90], three of the authors of the current document have explored the operational
semantics of inheritance with a coercion semantics in a call-by-value setting. The results there are
intuitively pleasing, but there is much more that needs to be done. This direction of investigation
offers several opportunities for practical applications of the specification and implementation of
compilers and interpreters for new languages with inheritance.
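
The two operational points made above can be illustrated with a small sketch. It is not taken from the paper: divergence is simulated by a hypothetical `Diverges` exception rather than a real infinite loop, and call-by-name is simulated by passing an unevaluated thunk. Under these assumptions, a constant function applied to a divergent argument diverges under call-by-value but not under call-by-name, and merely building an abstraction whose body diverges does not itself diverge.

```python
class Diverges(Exception):
    """Stands in for a non-terminating computation (an assumption of this sketch)."""


def omega():
    # A "divergent" term: we raise instead of looping forever.
    raise Diverges()


def const_zero(_argument):
    # A constant function: it ignores whatever it is given.
    return 0


def apply_by_value(f, arg_thunk):
    # Call-by-value: evaluate the actual parameter before calling f.
    return f(arg_thunk())


def apply_by_name(f, arg_thunk):
    # Call-by-name: hand f the unevaluated argument; const_zero never forces it.
    return f(arg_thunk)


def diverges(computation):
    # Report whether running the computation "diverges" in this simulation.
    try:
        computation()
        return False
    except Diverges:
        return True


# Building the abstraction (lambda x: omega()) does not run its body.
abstraction_diverges = diverges(lambda: (lambda x: omega()))
```

Here `apply_by_name(const_zero, omega)` returns 0, while `apply_by_value(const_zero, omega)` raises `Diverges`, which is the failure of the full $\beta$-rule described in the paragraph above.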

Existentials. We have omitted discussion of existentials in this paper. We believe that the coherence
results we have described will extend to a suitable interpretation of the existential types using the
equational theory for weak sums, but did not choose to involve ourselves in additional cases that
this would mean for our proofs.

Order-sorted algebra. The use of coercions in a first-order setting has been investigated in work of
J. A. Goguen, J.-P. Jouannaud and J. Meseguer on order-sorted algebras [GJM85, GM]. In particular,
the implementation of OBJ2 utilized a form of “inheritance as implicit coercion” approach.
Related work by Bruce and Wegner appears in [BW90].

Abstract coherence. Since there are many different calculi for which a coherence theorem is
interesting, it is very useful to have a more abstract theory from which special instances of coherence
can be derived, thus making coherence a more routine part of a semantic theory for an inheritance
calculus such as the one we have discussed. We mentioned earlier that coherence was an issue in
category theory and this might provide a framework for a more general theory. (However, the
results on coherence in the category theory literature are insufficient for the results of this paper,
so further extensions will be needed.) Using rewriting techniques, Curien and Ghelli have
developed a type-theoretic approach to the abstract coherence problem for $F_{\le}$, which is a subsystem
of SOURCE featuring only function and bounded generic types [CG90]. It would be interesting
to see this technique extended to all of SOURCE, especially in view of the complications we
encountered with variants.

Subtyping of bounded quantification. Our main coherence result was proved for a weaker version
of the system, one that uses the rule (W-FORALL) instead of (FORALL) (see Appendix A). We
believe that this is only a technical restriction that arose from our particular proof, and that
coherence holds for the stronger system. A proof would however require a way to circumvent the
usage of Lemma 11 in the treatment of the [CASE] rule in Lemma 12, since Lemma 11 fails when
(FORALL) is postulated (for a counterexample, see Giorgio Ghelli's dissertation [Ghe90]). Perhaps
greatest lower bounds and least upper bounds can be replaced by some canonical choice of lower
and upper bounds, a choice that may result from the derivation of the typing judgement itself.

Record update. For practical applications of calculi such as Fun, a particularly important problem
concerns the semantics of “record update”. The idea is this: given a function $f : s \to t$ and a record
$e$ with a field $l$ of type $s$, we would like to modify or update the $l$ field of $e$ by replacing $e.l$ by $f(e.l)$
without losing or modifying any of the other fields of $e$. The development of calculi which can deal
with this form of polymorphism and the ways in which Fun and related languages can be used to
represent similar techniques are an object of considerable current investigation. One recent effort
in this direction is [CM89] but several other efforts are under way. Despite its importance we have
not explored this issue in this paper since the discussion about it is very unsettled and it will merit
independent treatment at a later date.
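
The intended run-time behaviour of record update is easy to state even though its typing is the hard part. The following is a minimal sketch, assuming records are modelled as Python dictionaries; the name `update_field` is hypothetical and nothing here addresses the type-theoretic question discussed above.

```python
def update_field(record, label, f):
    """Return a copy of `record` whose `label` field is replaced by f(record[label]).

    All other fields are kept unchanged, including fields that a static
    type for `record` might not mention.
    """
    if label not in record:
        raise KeyError(label)
    updated = dict(record)              # copy, so the original record is untouched
    updated[label] = f(record[label])   # replace e.l by f(e.l)
    return updated


point = {"l": 3, "color": "red"}
moved = update_field(point, "l", lambda n: n + 1)
```

The field `color` survives the update even though the function passed in knows nothing about it, which is exactly the property a typed treatment of record update has to preserve.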

We believe that the “inheritance as implicit coercion” method is quite robust. For example, it
easily extends to accommodate “constant” inheritances between base types, such as $\mathit{int} \le \mathit{real}$,
as long as coherence conditions similar to the ones arising in the proofs of the relevant lemmas in
this paper hold between the constant coercions which interpret these inheritances. Moreover,
we expect that our methods will extend to the functional part of Quest [Car89a] and to the language
described in [CM89], using the techniques of Coquand [Coq88] and Lamarche [Lam88]. Current
work on inheritance and subtyping such as [CHC90] and [Mit90] will provide new challenges. We
do  not  claim  that  every  interesting  aspect  of  inheritance  can  necessarily  be  handled  in  this  way. 
However,  our  treatment,  by  showing  that  inheritance  can  be  uniformly  eliminated  in  favor  of 
definable  coercion,  provides  a  challenge  to  formalisms  which  purport  to  introduce  inheritance  as  a 
fundamentally  new  concept.  Moreover,  our  basic  approach  to  the  semantics  of  inheritance  should 
provide a useful contrast with other approaches.

Appendix A: The language SOURCE

Type  expressions: 

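Fragment: $a \mid \mathrm{Top} \mid s \to t \mid \{l_1 : s_1, \ldots, l_m : s_m\} \mid \forall a \le s.\ t \mid \mu a.\ t$

Variants: $\cdots \mid [l_1 : t_1, \ldots, l_n : t_n]$

where $a$ ranges over type variables, $m, n \ge 1$, and, in $\forall a \le s.\ t$, $a$ cannot be free in $s$. We will
use $[s/a]t$ for substitution.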

Raw  terms: 

Fragment: 

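$x \mid d(e) \mid \lambda x : t.\ e \mid \{l_1 = e_1, \ldots, l_m = e_m\} \mid e.l \mid \Lambda a \le t.\ e \mid e(t) \mid \mathrm{intro}[\mu a.\ t]e \mid \mathrm{elim}\ e$

Variants:

$\cdots \mid [l_1 : t_1, \ldots, l_i = e, \ldots, l_n : t_n] \mid \mathrm{case}\ e\ \mathrm{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n$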

where  x  ranges  over  (term)  variables  and  m,n  >  1.  (Note  the  type  decorations  on  variant 
“injections” ;  this  is  necessary  for  the  uniqueness  of  type  derivations  in  the  inheritance-less  system 
and  it  differs  from  [CW85].) 

Raw terms are type-checked by deriving typing judgements of the form $\Gamma \vdash e : t$, where
$\Gamma$ is a context. Contexts are defined recursively as follows: $\emptyset$ is a context; if $\Gamma$ is a context which
does not declare $a$, and the free variables of $t$ are declared in $\Gamma$, then $\Gamma, a \le t$ is a context; if $\Gamma$ is
a context which does not declare $x$, and the free variables of $t$ are declared in $\Gamma$, then $\Gamma, x : t$ is
a context. The proof system for deriving typing judgements makes use of inheritance judgements
which have the form $C \vdash s \le t$ where $C$ is an inheritance context. Inheritance contexts are
contexts in which only declarations of the form $a \le t$ appear. If $\Gamma$ is a context, we denote by $\hat{\Gamma}$
the inheritance context obtained from $\Gamma$ by erasing the declarations of the form $x : t$.

Rules  for  deriving  inheritance  judgements: 

Fragment: 


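(TOP)

$$C \vdash t \le \mathrm{Top}$$

where the free variables of $t$ are declared in $C$

(VAR)

$$C_1,\ a \le t,\ C_2 \vdash a \le t$$

(ARROW)

$$\frac{C \vdash s \le t \qquad C \vdash u \le v}{C \vdash t \to u \le s \to v}$$

(RECD)

$$\frac{C \vdash s_1 \le t_1 \quad \cdots \quad C \vdash s_p \le t_p}{C \vdash \{l_1 : s_1, \ldots, l_p : s_p, \ldots, l_q : s_q\} \le \{l_1 : t_1, \ldots, l_p : t_p\}}$$

(FORALL)

$$\frac{C \vdash s \le t \qquad C,\ a \le s \vdash u \le v}{C \vdash \forall a \le t.\ u \le \forall a \le s.\ v}$$

For Lemmas 11 and 12, and for Theorem 13 this is replaced with the weaker

(W-FORALL)

$$\frac{C,\ a \le t \vdash u \le v}{C \vdash \forall a \le t.\ u \le \forall a \le t.\ v}$$

(REFL)

$$C \vdash t \le t$$

where the free variables of $t$ are declared in $C$

(TRANS)

$$\frac{C \vdash r \le s \qquad C \vdash s \le t}{C \vdash r \le t}$$

Variants:

(VART)

$$\frac{C \vdash s_1 \le t_1 \quad \cdots \quad C \vdash s_p \le t_p}{C \vdash [l_1 : s_1, \ldots, l_p : s_p] \le [l_1 : t_1, \ldots, l_p : t_p, \ldots, l_q : t_q]}$$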
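
The structural inheritance rules can be read as a recursive check. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it covers only Top, base types, arrows, records and variants (no type variables, bounded quantification or recursive types), types are encoded as nested tuples, and all constructor and function names are hypothetical. Transitivity is not a separate case; for this small fragment the structural cases are intended to cover it.

```python
TOP = ("top",)


def base(name):
    return ("base", name)


def arrow(s, t):
    return ("arrow", s, t)


def record(**fields):
    return ("record", fields)


def variant(**cases):
    return ("variant", cases)


def is_subtype(s, t):
    """Decide s <= t for the small fragment described in the lead-in."""
    if t == TOP:                                   # (TOP)
        return True
    if s == t:                                     # (REFL)
        return True
    if s[0] == "arrow" and t[0] == "arrow":        # (ARROW): contravariant in the domain
        return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
    if s[0] == "record" and t[0] == "record":      # (RECD): s may have extra fields
        return all(
            label in s[1] and is_subtype(s[1][label], field_type)
            for label, field_type in t[1].items()
        )
    if s[0] == "variant" and t[0] == "variant":    # (VART): t may have extra cases
        return all(
            label in t[1] and is_subtype(case_type, t[1][label])
            for label, case_type in s[1].items()
        )
    return False


num = base("num")
boolean = base("bool")
```

For instance, `is_subtype(record(l=num, l1=num), record(l=num))` holds, while the converse does not; for variants the direction of the label inclusion is reversed.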


Rules  for  deriving  typing  judgements: 
Fragment: 


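[VAR]

$$\Gamma_1,\ x : t,\ \Gamma_2 \vdash x : t$$

[ABS]

$$\frac{\Gamma,\ x : s \vdash e : t}{\Gamma \vdash \lambda x : s.\ e : s \to t}$$

[APPL]

$$\frac{\Gamma \vdash d : s \to t \qquad \Gamma \vdash e : s}{\Gamma \vdash d(e) : t}$$

[RECD]

$$\frac{\Gamma \vdash e_1 : t_1 \quad \cdots \quad \Gamma \vdash e_m : t_m}{\Gamma \vdash \{l_1 = e_1, \ldots, l_m = e_m\} : \{l_1 : t_1, \ldots, l_m : t_m\}}$$

[SEL]

$$\frac{\Gamma \vdash e : \{l_1 : t_1, \ldots, l_m : t_m\}}{\Gamma \vdash e.l_i : t_i}$$

[B-GEN]

$$\frac{\Gamma,\ a \le s \vdash e : t}{\Gamma \vdash \Lambda a \le s.\ e : \forall a \le s.\ t}$$

[B-SPEC]

$$\frac{\Gamma \vdash e : \forall a \le s.\ t \qquad \hat{\Gamma} \vdash r \le s}{\Gamma \vdash e(r) : [r/a]t}$$

[R-INTRO]

$$\frac{\Gamma \vdash e : [\mu a.\ t/a]t}{\Gamma \vdash \mathrm{intro}[\mu a.\ t]e : \mu a.\ t}$$

[R-ELIM]

$$\frac{\Gamma \vdash e : \mu a.\ t}{\Gamma \vdash \mathrm{elim}\ e : [\mu a.\ t/a]t}$$

[INH]

$$\frac{\Gamma \vdash e : s \qquad \hat{\Gamma} \vdash s \le t}{\Gamma \vdash e : t}$$

Variants:

[VART]

$$\frac{\Gamma \vdash e : t_i}{\Gamma \vdash [l_1 : t_1, \ldots, l_i = e, \ldots, l_n : t_n] : [l_1 : t_1, \ldots, l_n : t_n]}$$

[CASE]

$$\frac{\Gamma \vdash e : [l_1 : t_1, \ldots, l_n : t_n] \qquad \Gamma \vdash f_1 : t_1 \to t \quad \cdots \quad \Gamma \vdash f_n : t_n \to t}{\Gamma \vdash \mathrm{case}\ e\ \mathrm{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n : t}$$

Appendix B: The language TARGET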
Type  expressions: 

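Fragment: $a \mid s \to t \mid \{l_1 : s_1, \ldots, l_m : s_m\} \mid \forall a.\ t \mid \mu a.\ t$

Variants: $\cdots \mid [l_1 : t_1, \ldots, l_n : t_n]$

Coercion space: $\cdots \mid s \mathbin{\circ\!\!\to} t$

where $a$ ranges over type variables and $n \ge 1$. For $m = 0$ we get the empty record type $1 \stackrel{\mathrm{def}}{=} \{\}$.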

Raw  terms: 

Fragment: 

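$x \mid M(N) \mid \lambda x : t.\ M \mid \{l_1 = M_1, \ldots, l_m = M_m\} \mid M.l \mid \Lambda a.\ M \mid M(t) \mid \mathrm{intro}[\mu a.\ t]M \mid \mathrm{elim}\ M$

Variants:

$\cdots \mid [l_1 : t_1, \ldots, l_i = M, \ldots, l_n : t_n] \mid \mathrm{case}\ M\ \mathrm{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n$

Coercion-coercion combinator:

$\cdots \mid \iota$

Coercion combinators:

$\cdots \mid \mathrm{top}[t] \mid \mathrm{arrow}[s, t, u, v] \mid \mathrm{recd}[s_1, \ldots, s_q, t_1, \ldots, t_p] \mid \mathrm{forall}[s, t, a, u, v] \mid \mathrm{vart}[s_1, \ldots, s_p, t_1, \ldots, t_q] \mid \mathrm{refl}[t] \mid \mathrm{trans}[r, s, t]$

where $x$ ranges over (term) variables and $n \ge 1$. For $m = 0$ we get the empty record, for which
we will keep the notation $\{\}$. We will usually omit the cumbersome type tags on the
coercion(-coercion) combinators. We use $[N/x]M$ for substitution.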

Typing judgements have the form $\Upsilon \vdash M : t$, where $\Upsilon$ is a typing context. Typing contexts
are defined recursively as follows: $\emptyset$ is a context; if $\Upsilon$ is a context which does not declare $a$, then
$\Upsilon, a$ is a typing context; if $\Upsilon$ is a context which does not declare $x$, and the free variables of $t$ are
declared in $\Upsilon$, then $\Upsilon, x : t$ is a typing context.


Rules  for  deriving  typing  judgements: 

Fragment: 

Same as in Appendix A: [VAR], [ABS], [APPL], [RECD] (in particular, for $n = 0$, $\Upsilon \vdash \{\} : 1$), [SEL].


[GEN]

$$\frac{\Upsilon,\ a \vdash M : t}{\Upsilon \vdash \Lambda a.\ M : \forall a.\ t}$$

[SPEC]

$$\frac{\Upsilon \vdash M : \forall a.\ t}{\Upsilon \vdash M(s) : [s/a]t}$$


Same  as  in  Appendix  A:  [R-INTRO]  ,  [R-ELIM]. 



Variants: 

Same  as  in  Appendix  A:  [VART]  ,  [CASE]. 

Coercion(-coercion)  combinators: 

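We omit the typing contexts to simplify the notation.

$\mathrm{top}[t] : t \mathbin{\circ\!\!\to} 1$

$\mathrm{arrow}[s, t, u, v] : (s \mathbin{\circ\!\!\to} t) \to (u \mathbin{\circ\!\!\to} v) \to ((t \to u) \mathbin{\circ\!\!\to} (s \to v))$

$\mathrm{recd}[s_1, \ldots, s_q, t_1, \ldots, t_p] : (s_1 \mathbin{\circ\!\!\to} t_1) \to \cdots \to (s_p \mathbin{\circ\!\!\to} t_p) \to (\{l_1 : s_1, \ldots, l_p : s_p, \ldots, l_q : s_q\} \mathbin{\circ\!\!\to} \{l_1 : t_1, \ldots, l_p : t_p\})$

$\mathrm{forall}[s, t, a, u, v] : (s \mathbin{\circ\!\!\to} t) \to \forall a.\ ((a \mathbin{\circ\!\!\to} s) \to (u \mathbin{\circ\!\!\to} v)) \to (\forall a.\ ((a \mathbin{\circ\!\!\to} t) \to u) \mathbin{\circ\!\!\to} \forall a.\ ((a \mathbin{\circ\!\!\to} s) \to v))$

$\mathrm{vart}[s_1, \ldots, s_p, t_1, \ldots, t_q] : (s_1 \mathbin{\circ\!\!\to} t_1) \to \cdots \to (s_p \mathbin{\circ\!\!\to} t_p) \to ([l_1 : s_1, \ldots, l_p : s_p] \mathbin{\circ\!\!\to} [l_1 : t_1, \ldots, l_p : t_p, \ldots, l_q : t_q])$

$\mathrm{refl}[t] : t \mathbin{\circ\!\!\to} t$

$\mathrm{trans}[r, s, t] : (r \mathbin{\circ\!\!\to} s) \to (s \mathbin{\circ\!\!\to} t) \to (r \mathbin{\circ\!\!\to} t)$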


Equational  theory: 

Technically, equational judgements should all contain a typing context under which both terms
in the equation typecheck with the same type [CGW87, BC88, CGW89]. To simplify the notation,
we will in most cases omit these contexts.

Fragment: 

We omit the simple rules for reflexivity, symmetry, transitivity, and congruence with respect
to function application, record formation, field selection, application to types, recursive type
introduction, and recursive type elimination.


{XI}

$$\frac{\Upsilon,\ x : s \vdash M = N}{\Upsilon \vdash \lambda x : s.\ M = \lambda x : s.\ N}$$

{TYPE-XI}

$$\frac{\Upsilon,\ a \vdash M = N}{\Upsilon \vdash \Lambda a.\ M = \Lambda a.\ N}$$

{RECD-BETA} $\{l_1 = M_1, \ldots, l_m = M_m\}.l_i = M_i$

where $m \ge 1$, $M_1 : t_1, \ldots, M_m : t_m$.

{RECD-ETA} $\{l_1 = M.l_1, \ldots, l_m = M.l_m\} = M$

where $M : \{l_1 : t_1, \ldots, l_m : t_m\}$. For $m = 0$, this rule gives $\{\} = M$ which makes 1 into a
terminator.

{FORALL-BETA} $(\Lambda a.\ M)(r) = [r/a]M$

{FORALL-ETA} $\Lambda a.\ M(a) = M$

where $M : \forall a.\ t$ and $a$ not free in $M$.

{R-BETA} $\mathrm{elim}\ (\mathrm{intro}[\mu a.\ t]M) = M$

where $M : [\mu a.\ t/a]t$.

{R-ETA} $\mathrm{intro}[\mu a.\ t](\mathrm{elim}\ M) = M$

where $M : \mu a.\ t$.

Variants: 

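We omit the simple rules for congruence with respect to variant formation and case analysis.

{VART-BETA} $\mathrm{case}\ \mathrm{inj}_{l_i}(M_i)\ \mathrm{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n = F_i(M_i)$

where $F_1 : t_1 \to t, \ldots, F_n : t_n \to t$, $M_i : t_i$ and $\mathrm{inj}_{l_i}$ is shorthand for
$\lambda x : t_i.\ [l_1 : t_1, \ldots, l_i = x, \ldots, l_n : t_n]$.

{VART-ETA} $\mathrm{case}\ M\ \mathrm{of}\ l_1 \Rightarrow \mathrm{inj}_{l_1}, \ldots, l_n \Rightarrow \mathrm{inj}_{l_n} = M$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$.

{VART-CRN} $\iota(P)(\mathrm{case}\ M\ \mathrm{of}\ l_1 \Rightarrow F_1, \ldots, l_n \Rightarrow F_n) = \mathrm{case}\ M\ \mathrm{of}\ l_1 \Rightarrow F_1; \iota(P), \ldots, l_n \Rightarrow F_n; \iota(P)$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$, $F_1 : t_1 \to t, \ldots, F_n : t_n \to t$, $P : t \mathbin{\circ\!\!\to} s$.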

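Alternatively, we could require instead of {VART-ETA} + {VART-CRN}:

{VART-COP} $Q(M) = \mathrm{case}\ M\ \mathrm{of}\ l_1 \Rightarrow \mathrm{inj}_{l_1}; Q, \ldots, l_n \Rightarrow \mathrm{inj}_{l_n}; Q$

where $M : [l_1 : t_1, \ldots, l_n : t_n]$, $Q : [l_1 : t_1, \ldots, l_n : t_n] \to t$.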

Coercion(-coercion)  combinators: 

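$\iota(\mathrm{top}) = \lambda x : t.\ \{\}$

$\iota(\mathrm{arrow}(P)(Q)) = \lambda z : t \to u.\ (\iota(P)); z; (\iota(Q))$

where $P : s \mathbin{\circ\!\!\to} t$, $Q : u \mathbin{\circ\!\!\to} v$.

$\iota(\mathrm{recd}(P_1) \cdots (P_p)) = \lambda w : \{l_1 : s_1, \ldots, l_p : s_p, \ldots, l_q : s_q\}.\ \{l_1 = \iota(P_1)(w.l_1), \ldots, l_p = \iota(P_p)(w.l_p)\}$

where $P_1 : s_1 \mathbin{\circ\!\!\to} t_1, \ldots, P_p : s_p \mathbin{\circ\!\!\to} t_p$.

$\iota(\mathrm{forall}(P)(W)) = \lambda z : (\forall a.\ (a \mathbin{\circ\!\!\to} t) \to u).\ \Lambda a.\ \lambda f : a \mathbin{\circ\!\!\to} s.\ \iota(W(a)(f))(z(a)(\mathrm{trans}(f)(P)))$

where $P : s \mathbin{\circ\!\!\to} t$, $W : \forall a.\ (a \mathbin{\circ\!\!\to} s) \to (u \mathbin{\circ\!\!\to} v)$.

$\iota(\mathrm{vart}(P_1) \cdots (P_p)) = \lambda w : [l_1 : s_1, \ldots, l_p : s_p].\ \mathrm{case}\ w\ \mathrm{of}\ l_1 \Rightarrow \iota(P_1); \mathrm{inj}_{l_1}, \ldots, l_p \Rightarrow \iota(P_p); \mathrm{inj}_{l_p}$

where $P_1 : s_1 \mathbin{\circ\!\!\to} t_1, \ldots, P_p : s_p \mathbin{\circ\!\!\to} t_p$.

$\iota(\mathrm{refl}) = \lambda x : t.\ x$

$\iota(\mathrm{trans}(P)(Q)) = \iota(P); \iota(Q)$

where $P : r \mathbin{\circ\!\!\to} s$, $Q : s \mathbin{\circ\!\!\to} t$.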
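
The equations for the coercion combinators describe ordinary functions, and a small executable reading may help. The sketch below is an assumption-laden illustration rather than the TARGET calculus itself: coercions are represented directly by Python functions (so the combinator written $\iota$ in the text becomes the identity and is not shown), records are dictionaries, variants are `(label, payload)` pairs, and the bounded-quantifier combinator is omitted. All function names are hypothetical.

```python
def top():
    # top sends any value to the empty record.
    return lambda x: {}


def refl():
    # refl is the identity coercion.
    return lambda x: x


def trans(p, q):
    # trans(P)(Q) is composition in diagrammatic order: first p, then q.
    return lambda x: q(p(x))


def arrow(p, q):
    # Given p from s to t and q from u to v, coerce a function z from t to u
    # into a function from s to v: coerce the argument, call z, coerce the result.
    return lambda z: (lambda x: q(z(p(x))))


def recd(**field_coercions):
    # Keep only the listed fields, coercing each one; other fields are dropped.
    return lambda w: {label: c(w[label]) for label, c in field_coercions.items()}


def vart(**case_coercions):
    # Coerce the payload of a variant value and keep its label.
    def coerce(w):
        label, payload = w
        return (label, case_coercions[label](payload))
    return coerce


drop_extra = recd(l=refl())                   # forget every field except l
select_l = lambda r: r["l"]
widened = arrow(drop_extra, refl())(select_l)  # now accepts records with extra fields
```

With these definitions `drop_extra({"l": 0, "l1": 1})` is `{"l": 0}`, and `widened` can be applied to `{"l": 2, "l2": False}` to give 2.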

{IOTA-INJ}

$$\frac{\iota(P) = \iota(Q)}{P = Q}$$

Appendix C: The translation

We present first the remainder of the translation of the fragment discussed in section 3.


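(VAR)*

$$C_1^*,\ a,\ f : a \to t^*,\ C_2^* \vdash f : a \to t^*$$

(RECD)*

$$\frac{C^* \vdash P_1 : s_1^* \to t_1^* \quad \cdots \quad C^* \vdash P_p : s_p^* \to t_p^*}{C^* \vdash R : \{l_1 : s_1^*, \ldots, l_p : s_p^*, \ldots, l_q : s_q^*\} \to \{l_1 : t_1^*, \ldots, l_p : t_p^*\}}$$

where $R \stackrel{\mathrm{def}}{=} \lambda w : \{l_1 : s_1^*, \ldots, l_p : s_p^*, \ldots, l_q : s_q^*\}.\ \{l_1 = P_1(w.l_1), \ldots, l_p = P_p(w.l_p)\}$

(REFL)*

$$C^* \vdash \lambda x : t^*.\ x : t^* \to t^*$$

where the free variables of $t^*$ are declared in $C^*$

(TRANS)*

$$\frac{C^* \vdash P : r^* \to s^* \qquad C^* \vdash Q : s^* \to t^*}{C^* \vdash P; Q : r^* \to t^*}$$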


The rules [VAR], [ABS], [APPL], [RECD], [SEL], [R-INTRO], [R-ELIM] are translated
straightforwardly, see below. Here is the translation of the only other rule left (the translations of
the other rules appear in section 3).


[B-GEN]*

$$\frac{\Gamma^*,\ a,\ f : a \to s^* \vdash M : t^*}{\Gamma^* \vdash \Lambda a.\ \lambda f : a \to s^*.\ M : \forall a.\ ((a \to s^*) \to t^*)}$$


In  the  following,  we  present  the  translation  for  the  full  calculus.  As  before,  for  any  SOURCE 
item  we  will  denote  by  item*  its  translation  into  TARGET  .  We  begin  with  the  types.  Note  the 
translation  of  bounded  generics  and  of  Top. 


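$a^* \stackrel{\mathrm{def}}{=} a$

$(\forall a \le s.\ t)^* \stackrel{\mathrm{def}}{=} \forall a.\ ((a \mathbin{\circ\!\!\to} s^*) \to t^*)$

$\mathrm{Top}^* \stackrel{\mathrm{def}}{=} 1$

$(\mu a.\ t)^* \stackrel{\mathrm{def}}{=} \mu a.\ t^*$

$(s \to t)^* \stackrel{\mathrm{def}}{=} s^* \to t^*$

$[l_1 : s_1, \ldots, l_n : s_n]^* \stackrel{\mathrm{def}}{=} [l_1 : s_1^*, \ldots, l_n : s_n^*]$

$\{l_1 : s_1, \ldots, l_m : s_m\}^* \stackrel{\mathrm{def}}{=} \{l_1 : s_1^*, \ldots, l_m : s_m^*\}$

where $s \times t = \{\mathrm{left} : s, \mathrm{right} : t\}$.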
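
The translation of types is a straightforward structural recursion, with the bounded quantifier and Top as the only interesting cases. The sketch below shows this under an assumed encoding of types as nested tuples; the tags `"bforall"`, `"forall"` and `"coerce"` (for the coercion space) and the function name `translate` are hypothetical choices made for illustration.

```python
def translate(ty):
    """Translate a SOURCE type (nested tuples) into a TARGET type."""
    kind = ty[0]
    if kind == "var":                      # a* = a
        return ty
    if kind == "top":                      # Top* = 1, the empty record type
        return ("record", {})
    if kind == "arrow":                    # (s -> t)* = s* -> t*
        return ("arrow", translate(ty[1]), translate(ty[2]))
    if kind in ("record", "variant"):      # translate each field or case type
        return (kind, {label: translate(t) for label, t in ty[1].items()})
    if kind == "mu":                       # (mu a. t)* = mu a. t*
        return ("mu", ty[1], translate(ty[2]))
    if kind == "bforall":                  # bounded quantifier: add a coercion argument
        _, a, bound, body = ty
        coercion_argument = ("coerce", ("var", a), translate(bound))
        return ("forall", a, ("arrow", coercion_argument, translate(body)))
    raise ValueError(f"unknown type constructor: {kind!r}")


source_type = ("bforall", "a", ("record", {"l": ("top",)}),
               ("arrow", ("var", "a"), ("var", "a")))
target_type = translate(source_type)
```

The bound of the quantifier disappears from the quantifier itself and reappears as the type of an extra argument, a coercion from the type variable into the translated bound.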

One shows immediately that $([s/a]t)^* = [s^*/a]t^*$. We extend this to contexts and inheritance
contexts, which translate into just typing contexts in TARGET.

$(\Gamma,\ a \le t)^* \stackrel{\mathrm{def}}{=} \Gamma^*,\ a,\ f : a \mathbin{\circ\!\!\to} t^*$ \qquad $(C,\ a \le t)^* \stackrel{\mathrm{def}}{=} C^*,\ a,\ f : a \mathbin{\circ\!\!\to} t^*$

$(\Gamma,\ x : t)^* \stackrel{\mathrm{def}}{=} \Gamma^*,\ x : t^*$

where $f$ is a fresh variable for each $a$.

Next we will describe how we translate the derivations of judgments of SOURCE. The translation
is defined by recursion on the structure of the derivation trees. Since these are freely generated
by the derivation rules, it is sufficient to provide for each derivation rule of SOURCE a
corresponding rule on trees of TARGET judgments. One then checks that these corresponding rules
are directly derivable in TARGET (Lemma 14 below), therefore the translation takes derivations
in SOURCE into derivations in TARGET.

A SOURCE derivation yielding an inheritance judgment $C \vdash s \le t$ is translated as a
tree of TARGET judgments yielding $C^* \vdash P : s^* \mathbin{\circ\!\!\to} t^*$. Here are the TARGET rules that
correspond to the rules for deriving inheritance judgements in SOURCE.


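(TOP)*

$$C^* \vdash \mathrm{top} : t^* \mathbin{\circ\!\!\to} 1$$

(VAR)*

$$C_1^*,\ a,\ f : a \mathbin{\circ\!\!\to} t^*,\ C_2^* \vdash f : a \mathbin{\circ\!\!\to} t^*$$

(ARROW)*

$$\frac{C^* \vdash P : s^* \mathbin{\circ\!\!\to} t^* \qquad C^* \vdash Q : u^* \mathbin{\circ\!\!\to} v^*}{C^* \vdash \mathrm{arrow}(P)(Q) : (t^* \to u^*) \mathbin{\circ\!\!\to} (s^* \to v^*)}$$

(RECD)*

$$\frac{C^* \vdash P_1 : s_1^* \mathbin{\circ\!\!\to} t_1^* \quad \cdots \quad C^* \vdash P_p : s_p^* \mathbin{\circ\!\!\to} t_p^*}{C^* \vdash \mathrm{recd}(P_1) \cdots (P_p) : \{l_1 : s_1^*, \ldots, l_p : s_p^*, \ldots, l_q : s_q^*\} \mathbin{\circ\!\!\to} \{l_1 : t_1^*, \ldots, l_p : t_p^*\}}$$

(FORALL)*

$$\frac{C^* \vdash P : s^* \mathbin{\circ\!\!\to} t^* \qquad C^*,\ a,\ f : a \mathbin{\circ\!\!\to} s^* \vdash Q : u^* \mathbin{\circ\!\!\to} v^*}{C^* \vdash \mathrm{forall}(P)(\Lambda a.\ \lambda f : a \mathbin{\circ\!\!\to} s^*.\ Q) : \forall a.\ ((a \mathbin{\circ\!\!\to} t^*) \to u^*) \mathbin{\circ\!\!\to} \forall a.\ ((a \mathbin{\circ\!\!\to} s^*) \to v^*)}$$

(VART)*

$$\frac{C^* \vdash P_1 : s_1^* \mathbin{\circ\!\!\to} t_1^* \quad \cdots \quad C^* \vdash P_p : s_p^* \mathbin{\circ\!\!\to} t_p^*}{C^* \vdash \mathrm{vart}(P_1) \cdots (P_p) : [l_1 : s_1^*, \ldots, l_p : s_p^*] \mathbin{\circ\!\!\to} [l_1 : t_1^*, \ldots, l_p : t_p^*, \ldots, l_q : t_q^*]}$$

(REFL)*

$$C^* \vdash \mathrm{refl} : t^* \mathbin{\circ\!\!\to} t^*$$

where the free variables of $t^*$ are declared in $C^*$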


(TRANS)*

$$\frac{C^* \vdash P : r^* \mathbin{\circ\!\!\to} s^* \qquad C^* \vdash Q : s^* \mathbin{\circ\!\!\to} t^*}{C^* \vdash \mathrm{trans}(P)(Q) : r^* \mathbin{\circ\!\!\to} t^*}$$

A SOURCE derivation yielding a typing judgment $\Gamma \vdash e : t$ is translated as a tree of
TARGET judgments yielding $\Gamma^* \vdash M : t^*$. Here are the TARGET rules that correspond to
the rules for deriving typing judgements in SOURCE.

The rules [VAR], [ABS], [APPL], [RECD], [SEL], [R-INTRO], [R-ELIM], [VART], [CASE]
all have direct correspondents in TARGET so their translation is straightforward. We illustrate it
with two examples.


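[VAR]*

$$\Gamma_1^*,\ x : t^*,\ \Gamma_2^* \vdash x : t^*$$

[B-SPEC]*

$$\frac{\Gamma^* \vdash M : \forall a.\ ((a \mathbin{\circ\!\!\to} s^*) \to t^*) \qquad \hat{\Gamma}^* \vdash P : r^* \mathbin{\circ\!\!\to} s^*}{\Gamma^* \vdash M(r^*)(P) : [r^*/a]t^*}$$

[INH]*

$$\frac{\Gamma^* \vdash M : s^* \qquad \hat{\Gamma}^* \vdash P : s^* \mathbin{\circ\!\!\to} t^*}{\Gamma^* \vdash \iota(P)(M) : t^*}$$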


Lemma 14 The rules (TOP)* – (TRANS)* and [VAR]* – [INH]* are directly derivable in
TARGET. $\Box$

Abstract. This paper relates two views of the operational semantics of a language with multiple
inheritance.  It  is  shown  that  the  introduction  of  explicit  coercions  as  an  interpretation  for  the 
implicit  coercion  of  inheritance  does  not  affect  the  evaluation  of  a  program  in  an  essential 
way.  The  result  is  proved  by  semantic  means  using  a  denotational  model  and  a  computational 
adequacy  result  to  relate  the  operational  and  denotational  semantics. 


1  Introduction 

There have been a number of efforts to understand the denotational semantics of inheritance
polymorphism and a variety of mathematical models for languages with subtle semantic features have
been discovered. However, as far as the authors of this paper know, no one has attempted to discuss
what, if anything, these denotational models have to do with the intended execution of programs in
the languages they model. For example, all of the published denotational models of the language
Fun of Cardelli and Wegner [CW85] (including the work of authors of this paper) model this language
in a way that corresponds to no reasonable interpretation of its operational semantics! No functional
programming language in common use diverges when evaluating the program $\lambda x.\ e$, even when the
expression $e$ may diverge. Yet the models for Fun which have been studied identify the
abstraction $\lambda x.\ \bot$ with the divergent program $\bot$. Besides this problem, all existing models satisfy the
unrestricted $\beta$ rule, which fails to be a legitimate transformation in call-by-value languages. Since
call-by-value is the most common form of evaluation, one is led to ask whether this commitment
to $\beta$ was an important feature of the models concerned. In short, very little has been done to close
the gap between denotational and operational theories of inheritance. We see two basic things as
missing from the current theories: (1) a careful discussion of the structural operational semantics
of languages with inheritance type systems and (2) any account of the relationship between the
suggested models and a reasonable account of operational semantics.

Our goal in this paper is to attempt an account of problem (1) guided by an approach to (2).
We carry out this study in a simple, familiar context by using an extension of Plotkin's illustrative
language PCF [Plo77]. We develop a simple structural operational semantics for this language in
the spirit of the evaluation mechanisms of languages such as LISP and ML in which functions call
their parameters by value. Our extension, which we call PCF+, is obtained by adding record and
variant types. This language is extended to a new language, PCF++, by permitting the use of
a form of inheritance which allows more programs to be viewed as type correct. We then study
the question of the proper operational interpretation of PCF++. One possible approach is simple
to understand: after a PCF++ program is shown to be type correct, the type information in the
term is erased and the resulting term (which lives in an extended untyped lambda-calculus) is
evaluated. However, in view of the form of semantics that we have studied in our work on Fun
and its relatives [BCGS89, BCGS90] there is another view of the proper operational semantics of
PCF++. Under this view, a term of PCF++ is translated into a PCF+ term by inserting explicit
coercions which “explain” the inheritance in the original PCF++ program in an intuitive way. If
this “explanation” really is intuitive and the first form of evaluation (which is a common form of
implementation) is reasonable, then it seems that there must be some relationship between these
two views of program evaluation! Moreover, this latter approach is also not uncommon as a form
of evaluation, and therefore has independent interest. In this paper we will show that these two
forms of evaluation are essentially the same for observable types.

Appears in Conference on Lisp and Functional Programming, edited by M. Wand, Nice, France, July
1990, pp. 44-60.

To give the reader an idea of what translation we have in mind let us look at an example of how
a simple program would be evaluated. Applying our semantic paradigm to PCF++, we translate its
programs into PCF+ programs, essentially by inserting explicit coercion terms wherever inheritance
is used in type-checking. In anticipation of an exact definition of this translation (section 2), here
is an example. PCF++ type-checks the program $P = G(F)$ where

$G = \lambda f : \{l : \mathit{num}\} \to \mathit{num}.\ \{k_1 = f(\{l = 0, l_1 = 1\}),\ k_2 = f(\{l = 2, l_2 = \mathit{false}\})\}$

$F = \lambda x : \{l : \mathit{num}\}.\ x.l$

Note that $G$ will not type-check in PCF+ because of the different types of the two arguments to
which $f$ is applied.

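The translation to PCF+ depends on the way $P$ is type-checked. One possible translation is
$P' = G'(F)$ where

$G' = \lambda f : \{l : \mathit{num}\} \to \mathit{num}.\ \{k_1 = f(\xi_1(\{l = 0, l_1 = 1\})),\ k_2 = f(\xi_2(\{l = 2, l_2 = \mathit{false}\}))\}$

where $\xi_1$ and $\xi_2$ are the following coercion terms

$\xi_1 = \lambda x_1 : \{l : \mathit{num}, l_1 : \mathit{num}\}.\ \{l = x_1.l\}$ \qquad $\xi_2 = \lambda x_2 : \{l : \mathit{num}, l_2 : \mathit{bool}\}.\ \{l = x_2.l\}$.

Another possible translation is $P'' = G''(F)$ where

$G'' = \lambda f : \{l : \mathit{num}\} \to \mathit{num}.\ \{k_1 = \zeta_1(f)(\{l = 0, l_1 = 1\}),\ k_2 = \zeta_2(f)(\{l = 2, l_2 = \mathit{false}\})\}$

where

$\zeta_1 = \lambda f_1 : \{l : \mathit{num}\} \to \mathit{num}.\ \lambda x_1 : \{l : \mathit{num}, l_1 : \mathit{num}\}.\ f_1(\xi_1(x_1))$

and

$\zeta_2 = \lambda f_2 : \{l : \mathit{num}\} \to \mathit{num}.\ \lambda x_2 : \{l : \mathit{num}, l_2 : \mathit{bool}\}.\ f_2(\xi_2(x_2))$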


The fact that the translation (more generally, the meaning) depends on the type-checking
derivation entails the need for denotational coherence results [BCGS89]. In this paper, however, we will
examine the computational (operational) aspects of this translation. Notice that the “execution”
of both $P'$ and $P''$ yields the same result


$\{k_1 = 0,\ k_2 = 2\}$


More importantly, so does the direct execution of $P$. The “direct” operational semantics for PCF++
that we have in mind is just the same as that of PCF+. It is a simple but crucial observation that
the same evaluation rules work on programs allowed by the more permissive type discipline of
PCF++. Not surprisingly, this is the natural way to implement such languages (Cardelli, personal
communication about Quest [Car89]). Although it may not be useful to fully translate a term
before executing it, it is reasonable to ask whether translation would affect the evaluation. Since
coercions remove the “junk” in a term, they may play a useful role in efficient implementation.
However, our primary interest is in the abstract specification of the language and not the details
of its efficient implementation.
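
The example can be replayed concretely. The following is a sketch only, assuming records are modelled as Python dictionaries and that Python's own call-by-value evaluation stands in for the operational semantics; the names `G1`, `G2`, `xi1`, `xi2`, `zeta1` and `zeta2` are transliterations chosen here for the primed programs and the coercion terms of the text.

```python
def F(x):
    # F selects the field l of its argument.
    return x["l"]


def G(f):
    # Direct program: f is applied to records carrying extra fields.
    return {"k1": f({"l": 0, "l1": 1}), "k2": f({"l": 2, "l2": False})}


def xi1(x1):
    # Coercion forgetting the field l1.
    return {"l": x1["l"]}


def xi2(x2):
    # Coercion forgetting the field l2.
    return {"l": x2["l"]}


def G1(f):
    # First translation: coerce each argument before applying f.
    return {"k1": f(xi1({"l": 0, "l1": 1})), "k2": f(xi2({"l": 2, "l2": False}))}


def zeta1(f1):
    # Coerce the function instead, so that it accepts the larger record type.
    return lambda x1: f1(xi1(x1))


def zeta2(f2):
    return lambda x2: f2(xi2(x2))


def G2(f):
    # Second translation: coerce f at each use.
    return {"k1": zeta1(f)({"l": 0, "l1": 1}), "k2": zeta2(f)({"l": 2, "l2": False})}


P, P1, P2 = G(F), G1(F), G2(F)
```

All three evaluations produce the record with `k1` equal to 0 and `k2` equal to 2, matching the observation that the two translations and the direct execution agree on this program.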

Our main result relates the direct execution of a PCF++ program phrase $e$ to the execution of
any of its PCF+ translations, $e^*$. We prove that

$e$ terminates if and only if $e^*$ terminates.

If both $e$ and $e^*$ terminate, what can we say about the relationship between the results of the two
computations? Of course, we are able to show that if the type of $e$ is ground (integer or boolean),
then the results are exactly the same. In this language we are also interested in computing
with more complex objects, such as records/variants of records/variants of ground data (this is
particularly consistent with the way things are viewed in object-oriented database programming
applications [OBB89] for example). We call the types of such data observable types. Now, the
philosophy of PCF++ is that the type of program phrases is part of them, i.e., user-supplied in some
sense. (This is in contrast with the approaches based on type inference; see for example [Wan89].)
At observable types, we show that the results of the two computations have the same components
in those record fields which appear in the prescribed type of the program phrase. This is the best
we can hope for, since the introduction of coercions yields computations which may remove “junk”
fields, namely the fields not occurring in the prescribed type. Moral: if you specify a type for your
program, don't expect to observe more than what the type allows. Anyway, our conclusion is that
coercions make no essential difference to the computation.
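
The phrase “the same components in those record fields which appear in the prescribed type” can be made concrete by projecting a result onto its observable type before comparing. The sketch below is an illustration under assumptions, not the paper's formal definition: observable types are encoded as nested tuples over `num`, `bool`, records and variants, values are Python numbers, booleans, dictionaries and `(label, payload)` pairs, and the name `project` is hypothetical.

```python
def project(value, ty):
    """Keep only what the observable type `ty` allows us to observe in `value`."""
    kind = ty[0]
    if kind in ("num", "bool"):
        return value                                   # ground data is fully observable
    if kind == "record":
        # Only fields named in the type are kept; "junk" fields are dropped.
        return {label: project(value[label], field_type)
                for label, field_type in ty[1].items()}
    if kind == "variant":
        label, payload = value
        return (label, project(payload, ty[1][label]))
    raise ValueError(f"not an observable type: {kind!r}")


prescribed = ("record", {"l": ("num",)})
direct_result = {"l": 0, "l1": 1}      # direct execution may keep a junk field
translated_result = {"l": 0}           # a coercion may have removed it
same_observation = project(direct_result, prescribed) == project(translated_result, prescribed)
```

The two results differ as raw values but agree after projection, which is the sense in which coercions make no essential difference at observable types.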

While this result only relates our translation to the operational semantics, it can be used for
transfer of computational adequacy. Consider a denotational semantics $\mathcal{D}^{+}$ of PCF+ for which our
translation is coherent. This yields a denotational semantics $\mathcal{D}^{++}$ for PCF++ where a term is
interpreted by first translating it into PCF+ and then taking the $\mathcal{D}^{+}$-meaning of the translation.
Under some reasonable assumptions about $\mathcal{D}^{+}$, our main result implies that if $\mathcal{D}^{+}$ is
computationally adequate (i.e. the meaning of a term $e$ is non-bottom iff the evaluation of $e$ terminates)
for the operational semantics of PCF+ then $\mathcal{D}^{++}$ is computationally adequate for the operational
semantics of PCF++.

An interesting methodological twist is that our proof of the main result actually uses a specific
denotational semantics $[\![\cdot]\!]^{+}$ which is computationally adequate for PCF+ and for which this transfer
can be done! As it is, we show directly that $[\![\cdot]\!]^{++}$ is computationally adequate for PCF++ and we
derive our main result from this. We regard this as a nice example of the use of a domain-theoretic
semantics for obtaining an essentially syntactic result.

Another comment on methodology. We have chosen to focus on call-by-value operational
semantics since this is the most common style of implementation for the languages we are studying
and because it offers a change of pace from our earlier results [BCGS89] where we focused on models
in which the unrestricted $\beta$ axiom holds. We expect that results such as the ones we are proving in
this paper could be formulated for a call-by-name operational semantics, although this would call
for some changes in our concept of observability.

In section 2 we begin by introducing the syntax of PCF++ as an extension of PCF+. Then we
describe the translation back, from PCF++ to PCF+. Finally we give the call-by-value operational
semantics and state our main theorem. In section 3 we give a domain-theoretic denotational
semantics of PCF+ for which our translation is coherent and for which the operational semantics
of PCF+ is sound and computationally adequate. We prove that the operational semantics of
PCF++ is sound and computationally adequate for the induced denotational semantics and then
we show how to derive from this our main theorem. The paper ends with a section of conclusions
and ideas for more work.

2  From  PCF+  to  PCF++  and  back  again. 

In  this  section  we  introduce  the  two  calculi  on  which  the  central  result  of  the  paper  focuses. 

2.1 Extending PCF+ to PCF++.

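The following grammar defines the syntax of type expressions $s$ and raw terms $e$ of our calculi. We
assume primitive syntax classes of variables and labels:

$$
\begin{array}{rcl}
x & \in & \mathrm{Variable} \\
l & \in & \mathrm{Label} \\
s & ::= & \mathrm{num} \mid \mathrm{bool} \mid s \to s \mid \{l_1 : s_1, \ldots, l_n : s_n\} \mid [l_1 : s_1, \ldots, l_n : s_n] \\
e & ::= & 0 \mid \mathrm{Succ}(e) \mid \mathrm{Pred}(e) \mid \mathrm{true} \mid \mathrm{false} \mid \mathrm{IsZero}(e) \mid \\
  &     & x \mid \lambda x : s.\ e \mid e(e) \mid \mu x : s.\ e \mid \mathrm{if}\ e\ \mathrm{then}\ e\ \mathrm{else}\ e \mid \\
  &     & \{l_1 = e, \ldots, l_n = e\} \mid e.l \mid [l = e] \mid \mathrm{case}\ e\ \mathrm{of}\ l_1 \Rightarrow e, \ldots, l_n \Rightarrow e
\end{array}
$$

For records $\{l_1 = e_1, \ldots, l_n = e_n\}$ and variants $[l_1 = e_1, \ldots, l_n = e_n]$, it is assumed that
the labels $l_1, \ldots, l_n$ are all distinct. We assume that the reader can infer from our notation what is
meant by free and bound variables of raw terms. A raw term is said to be closed if it has no free variables.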

A type context is a list $x_1 : s_1, \ldots, x_n : s_n$ of pairs of variables and types. We assume that the
variables $x_i$ in such a context are distinct. A typing judgement is a sequent of the form $H \vdash e : s$
where $H$ is a typing context which includes all of the free variables of the raw term $e$. A typing
judgement is said to be derivable in PCF+ if it can be proved using the axioms and rules listed in
Table 1. It is not hard to see that any derivable sequent has a unique derivation. This latter fact
will not be true of the calculus PCF++ which we now define. PCF++ is the extension of PCF+
to a calculus with multiple inheritance. First of all, we define a binary relation $s \le t$ of subtyping
between type expressions $s$ and $t$ using the rules in Table 2. The reader can check that $\le$ is a
preorder on type expressions. This relation is now incorporated into the typing system of PCF++
by the addition of the subsumption rule:

$$
\frac{H \vdash e : s \qquad s \le t}{H \vdash e : t}
$$
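
As a concrete reading of the subtyping relation, the following Python sketch decides whether one type is a subtype of another by following the shape of the inheritance rules in Table 2. The tuple encoding of types and the names `is_subtype`, `fun`, `rec`, and `var` are illustrative assumptions and do not come from the paper; records and variants are treated here as unordered sets of labelled components, which is also an assumption.

```python
# Hypothetical encoding of type expressions as nested tuples.
NUM = ("num",)
BOOL = ("bool",)

def fun(s, t):
    return ("fun", s, t)

def rec(**fields):
    return ("rec", fields)

def var(**cases):
    return ("var", cases)

def is_subtype(s, t):
    """Return True when 's is a subtype of t' follows from the inheritance rules."""
    if s[0] != t[0]:
        return False
    kind = s[0]
    if kind in ("num", "bool"):
        return True
    if kind == "fun":
        # Contravariant in the argument, covariant in the result.
        return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
    if kind == "rec":
        # The subtype may carry extra fields; shared fields must be subtypes.
        return all(l in s[1] and is_subtype(s[1][l], t[1][l]) for l in t[1])
    if kind == "var":
        # The supertype may offer extra cases; shared cases must be subtypes.
        return all(l in t[1] and is_subtype(s[1][l], t[1][l]) for l in s[1])
    raise ValueError("unknown type constructor: %r" % (kind,))
```

For instance, with `p3 = rec(x=NUM, y=NUM, z=NUM)` and `p2 = rec(x=NUM, y=NUM)`, the check `is_subtype(p3, p2)` succeeds while `is_subtype(p2, p3)` does not, and function types reverse the direction on their argument.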

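Table 1: Typing rules for PCF+.

$$
\mathrm{num} \le \mathrm{num}
\qquad
\mathrm{bool} \le \mathrm{bool}
\qquad
\frac{s' \le s \qquad t \le t'}{s \to t \le s' \to t'}
$$

$$
\frac{s_1 \le t_1 \quad \cdots \quad s_n \le t_n}
     {\{l_1 : s_1, \ldots, l_n : s_n, \ldots, l_m : s_m\} \le \{l_1 : t_1, \ldots, l_n : t_n\}}
$$

$$
\frac{s_1 \le t_1 \quad \cdots \quad s_n \le t_n}
     {[l_1 : s_1, \ldots, l_n : s_n] \le [l_1 : t_1, \ldots, l_n : t_n, \ldots, l_m : t_m]}
$$

Table 2: Inheritance rules.

2.2 Translation from PCF++ into PCF+.

Definition: Given types $s$ and $t$ such that $s \le t$ is provable, we define a PCF+ term
$\mathrm{coerce}[s \le t]$ of type $s \to t$ by induction on the proof of $s \le t$ as follows:

• $\mathrm{coerce}[\mathrm{bool} \le \mathrm{bool}] = \lambda x : \mathrm{bool}.\ x$ and
$\mathrm{coerce}[\mathrm{num} \le \mathrm{num}] = \lambda x : \mathrm{num}.\ x$.

• $\mathrm{coerce}[s \to t \le s' \to t'] = \lambda f : s \to t.\ \mathrm{coerce}[t \le t'] \circ f \circ \mathrm{coerce}[s' \le s]$

• Say $s = \{l_1 : s_1, \ldots, l_n : s_n, \ldots, l_m : s_m\}$ and $t = \{l_1 : t_1, \ldots, l_n : t_n\}$ and $s \le t$, then

$$
\mathrm{coerce}[s \le t] = \lambda x : s.\ \{l_1 = \mathrm{coerce}[s_1 \le t_1](x.l_1), \ldots, l_n = \mathrm{coerce}[s_n \le t_n](x.l_n)\}
$$

• Say $s = [l_1 : s_1, \ldots, l_n : s_n]$ and $t = [l_1 : t_1, \ldots, l_n : t_n, \ldots, l_m : t_m]$ and $s \le t$, then

$$
\mathrm{coerce}[s \le t] = \lambda x : s.\ \mathrm{case}\ x\ \mathrm{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n
$$

where $f_i = \lambda y : s_i.\ [l_i = \mathrm{coerce}[s_i \le t_i](y)]$ for each $i = 1, \ldots, n$. $\blacksquare$

Lemma 1 If $s \le t$ is derivable, then so is $\vdash \mathrm{coerce}[s \le t] : s \to t$. $\blacksquare$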
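
The recursive structure of the coercion definition can be sketched in Python as follows. This is not the PCF+ term itself: it builds an ordinary Python function with the same behaviour on values, under the assumptions that numbers and booleans are Python values, records are dictionaries, variants are `(label, payload)` pairs, and functions are Python callables. The tuple encoding of types and the helper names are hypothetical, and the sketch assumes the subtyping judgement is derivable rather than checking it.

```python
# Hypothetical encoding of type expressions as nested tuples.
NUM = ("num",)
BOOL = ("bool",)

def fun(s, t):
    return ("fun", s, t)

def rec(**fields):
    return ("rec", fields)

def var(**cases):
    return ("var", cases)

def coerce(s, t):
    """Build the coercion from s to t as a Python function on values.

    Assumes that s is a subtype of t; no check is performed here.
    """
    kind = s[0]
    if kind in ("num", "bool"):
        return lambda x: x                      # identity at base types
    if kind == "fun":
        pre = coerce(t[1], s[1])                # coerce the argument backwards
        post = coerce(s[2], t[2])               # coerce the result forwards
        return lambda f: (lambda a: post(f(pre(a))))
    if kind == "rec":
        # Keep only the fields named in t, coercing each one.
        per_field = {l: coerce(s[1][l], t[1][l]) for l in t[1]}
        return lambda r: {l: c(r[l]) for l, c in per_field.items()}
    if kind == "var":
        # Keep the label, coerce the payload of whichever case is present.
        per_case = {l: coerce(s[1][l], t[1][l]) for l in s[1]}
        return lambda v: (v[0], per_case[v[0]](v[1]))
    raise ValueError("unknown type constructor: %r" % (kind,))
```

The record case is where "junk" fields disappear: coercing `{"x": 1, "y": 2, "z": 3}` from a three-field record type to the two-field type with labels `x` and `y` yields `{"x": 1, "y": 2}`.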

We  will  now  describe  how  we  translate  the  derivations  of  typing  judgments  of  PCF++  into 
derivations  of  PCF+.  The  translation  is  defined  by  recursion  on  the  structure  of  the  derivation 
trees.  Since  these  are  freely  generated  by  the  typing  rules,  it  is  sufficient  to  provide  for  each 
rule  of  PCF++  a  corresponding  rule  on  trees  of  PCF+  judgments.  For  the  correspondence  which 
we  describe,  it  is  possible  to  show  that  these  corresponding  rules  are  directly  derivable  in  PCF+, 
therefore  the  translation  takes  derivations  in  PCF++  into  derivations  in  PCF+. 

A PCF++ derivation $\Delta$ yielding an inheritance judgment $H \vdash e : s$ is translated as a tree $T\Delta$
of PCF+ judgments yielding a translation $T^{*}\Delta$ of the form $H \vdash e^{*} : s$. All of the rules of PCF++
except the subsumption rule are translated “without change.” For example, the axiom $\vdash 0 : \mathrm{num}$ is
translated as itself, whereas the rule

$$
\frac{H \vdash e : \mathrm{num}}{H \vdash \mathrm{Succ}(e) : \mathrm{num}}
$$

is translated as

$$
\frac{H \vdash e^{*} : \mathrm{num}}{H \vdash \mathrm{Succ}(e^{*}) : \mathrm{num}}
$$

where $H \vdash e^{*} : \mathrm{num}$ is the root of the translation of the derivation of $H \vdash e : \mathrm{num}$. Only the
subsumption rule is altered by the translation. In particular, the rule

$$
\frac{H \vdash e : s \qquad s \le t}{H \vdash e : t}
$$

is translated by the rule

$$
\frac{H \vdash e^{*} : s \qquad \vdash \mathrm{coerce}[s \le t] : s \to t}{H \vdash \mathrm{coerce}[s \le t](e^{*}) : t}
$$

which “makes the implicit coercion explicit.”

It  is  not  hard  to  see  that  a  PCF++  typing  judgement  may  have  many  different  derivations. 
The  reader  may  wish  to  look  at  different  possible  derivations  for  the  term  in  the  introduction  to 
get  a  sense  of  why  this  is  the  case.  This  presents  a  problem  for  the  translation:  is  there  any  sense 
in  which  two  translations  to  PCF+  of  a  given  PCF++  term  are  related?  In  particular,  this  paper’s 
main  theorem  can  be  used  to  demonstrate  a  close  relationship  between  the  operational  semantics 
of  the  two  translations. 

2.3  Operational  semantics  and  Main  Theorem. 

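The operational semantics of the closed raw terms of our calculi is given by the least relation $\Downarrow$
between raw terms and canonical forms which satisfies the rules and axioms in Table 3. Canonical forms are
defined as follows: $0$, true, and false are canonical forms. For any expression $e$, $\lambda x : s.\ e$ is a
canonical form. If $c_1, \ldots, c_n$ are canonical forms, then $\{l_1 = c_1, \ldots, l_n = c_n\}$ is a canonical form. If
$c$ is a canonical form, then $\mathrm{Succ}(c)$ and $[l = c]$ are canonical forms. We may also write $e \Downarrow$ if there
is a canonical form $c$ such that $e \Downarrow c$.

$$
0 \Downarrow 0 \qquad \mathrm{true} \Downarrow \mathrm{true} \qquad \mathrm{false} \Downarrow \mathrm{false}
$$

$$
\frac{e \Downarrow \mathrm{Succ}(c)}{\mathrm{Pred}(e) \Downarrow c}
\qquad
\frac{e \Downarrow c}{\mathrm{Succ}(e) \Downarrow \mathrm{Succ}(c)}
\qquad
\frac{e \Downarrow 0}{\mathrm{IsZero}(e) \Downarrow \mathrm{true}}
\qquad
\frac{e \Downarrow \mathrm{Succ}(c)}{\mathrm{IsZero}(e) \Downarrow \mathrm{false}}
$$

$$
\lambda x : s.\ e \Downarrow \lambda x : s.\ e
\qquad
\frac{e \Downarrow \lambda x : s.\ e'' \qquad e' \Downarrow c' \qquad [c'/x]e'' \Downarrow c}{e(e') \Downarrow c}
$$

$$
\frac{e_1 \Downarrow \mathrm{true} \qquad e_2 \Downarrow c}{\mathrm{if}\ e_1\ \mathrm{then}\ e_2\ \mathrm{else}\ e_3 \Downarrow c}
\qquad
\frac{e_1 \Downarrow \mathrm{false} \qquad e_3 \Downarrow c}{\mathrm{if}\ e_1\ \mathrm{then}\ e_2\ \mathrm{else}\ e_3 \Downarrow c}
$$

$$
\frac{e_1 \Downarrow c_1 \quad \cdots \quad e_n \Downarrow c_n}{\{l_1 = e_1, \ldots, l_n = e_n\} \Downarrow \{l_1 = c_1, \ldots, l_n = c_n\}}
\qquad
\frac{e \Downarrow \{l_1 = c_1, \ldots, l_n = c_n\}}{e.l_i \Downarrow c_i}
$$

$$
\frac{e \Downarrow c}{[l = e] \Downarrow [l = c]}
\qquad
\frac{e \Downarrow [l_i = c'] \qquad f_i(c') \Downarrow c}{\mathrm{case}\ e\ \mathrm{of}\ l_1 \Rightarrow f_1, \ldots, l_n \Rightarrow f_n \Downarrow c}
$$

$$
\frac{[\mu x : s.\ e / x]e \Downarrow c}{\mu x : s.\ e \Downarrow c}
$$

Table 3: Call-by-value evaluation.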


For raw terms $e$ and $e'$ we write $[e'/x]e$ for the result of substituting $e'$ for $x$ in $e$. We demand
all of the usual assumptions about the renaming of bound variables in $e$ to avoid capturing free
variables of $e'$. We assume that the substitution operation associates to the right and we may write
$[e_1, \ldots, e_n / x_1, \ldots, x_n]e$ for the simultaneous substitution of $e_1, \ldots, e_n$ for $x_1, \ldots, x_n$ respectively
in $e$. In the event that the terms $e_i$ are closed, note that this is the same as $[e_1/x_1] \cdots [e_n/x_n]e$ and,
indeed, the order of the substitutions does not matter.

It is not hard to see that if $e$ is a closed raw term such that $e \Downarrow c$, then $c$ is uniquely determined.
This can be proved by showing that, for a given term $e$, there is at most one axiom or rule
from Table 3 which applies to it. Hence the rules define a deterministic evaluation strategy. The
evaluation of function application is call-by-value, since the argument to the application is evaluated
before being substituted into the body of the applied procedure. There is no evaluation under a
lambda-abstraction, but note that records are eagerly evaluated. For example, the evaluation of
an expression $\{l = e, l' = e'\}.l$ will result in the evaluation of $e'$ as well as $e$ even though $e'$ is “not
needed” in the result. Putting aside efficiency issues, this is only significant if $e'$ diverges since, in
that case, the evaluation of $\{l = e, l' = e'\}.l$ will also diverge. Since evaluation is deterministic, we
may define a partial function $\mathcal{E}$ on raw terms as follows:

$$
\mathcal{E}(e) \simeq
\begin{cases}
c & \text{if there is a } c \text{ such that } e \Downarrow c \\
\text{undefined} & \text{otherwise}
\end{cases}
$$

We use the symbol $\simeq$ between mathematical expressions to indicate that one of the expressions
being related may be undefined. In general, for expressions $E$ and $E'$, $E \simeq E'$ means that if either
$E$ or $E'$ is defined, then so is the other and the values are the same.
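
To make the evaluation strategy concrete, here is a small Python sketch of an evaluator for a fragment of the language (no `Pred`, variants, or `case`), written to follow the corresponding rules of Table 3. The tuple encoding of terms, the omission of type annotations, and the names `evaluate`, `subst`, `E`, and `OutOfFuel` are assumptions made for illustration. Since divergence cannot be detected in general, the sketch uses a recursion-depth budget and reports "undefined" as `None` when the budget runs out; this only approximates the partial function described above. Inputs are assumed to be closed, well-typed terms.

```python
class OutOfFuel(Exception):
    """Raised when the depth budget is exhausted; stands in for divergence."""

def subst(v, x, e):
    """Substitute the closed term v for x in e (no capture is possible)."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag in ("lam", "mu"):
        # A binder for the same name shadows x, so stop there.
        return e if e[1] == x else (tag, e[1], subst(v, x, e[2]))
    if tag == "rec":
        return ("rec", {l: subst(v, x, a) for l, a in e[1].items()})
    if tag == "sel":
        return ("sel", subst(v, x, e[1]), e[2])
    # Remaining constructors hold only subterms after the tag.
    return (tag,) + tuple(subst(v, x, a) for a in e[1:])

def evaluate(e, fuel=200):
    """Return the canonical form of the closed term e, call-by-value."""
    if fuel == 0:
        raise OutOfFuel()
    tag = e[0]
    if tag in ("zero", "true", "false", "lam"):
        return e                                  # no evaluation under lambda
    if tag == "succ":
        return ("succ", evaluate(e[1], fuel - 1))
    if tag == "iszero":
        return ("true",) if evaluate(e[1], fuel - 1) == ("zero",) else ("false",)
    if tag == "app":
        f = evaluate(e[1], fuel - 1)
        arg = evaluate(e[2], fuel - 1)            # argument is evaluated first
        return evaluate(subst(arg, f[1], f[2]), fuel - 1)
    if tag == "if":
        test = evaluate(e[1], fuel - 1)
        return evaluate(e[2] if test == ("true",) else e[3], fuel - 1)
    if tag == "rec":
        # Every field is evaluated, whether or not it is later selected.
        return ("rec", {l: evaluate(a, fuel - 1) for l, a in e[1].items()})
    if tag == "sel":
        return evaluate(e[1], fuel - 1)[1][e[2]]
    if tag == "mu":
        return evaluate(subst(e, e[1], e[2]), fuel - 1)   # unfold once
    raise ValueError("not a closed term of the fragment: %r" % (e,))

def E(e, fuel=200):
    """Approximate the partial evaluation function; None means undefined."""
    try:
        return evaluate(e, fuel)
    except OutOfFuel:
        return None
```

With `loop = ("mu", "x", ("var", "x"))`, selecting field `l` from the record `{l = 0, k = loop}` is undefined in this sketch even though `k` is never used, which mirrors the eager evaluation of records noted in the text.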

Let $\vdash e : s$ be a judgement which type-checks in PCF+ and suppose $e \Downarrow c$. It is easy to show
that $\vdash c : s$ also type-checks in PCF+. This fact is less obvious for PCF++. We express it in the
following:

Lemma 2 Suppose $\vdash e : s$ is derivable in PCF++ and $e \Downarrow c$, then $\vdash c : s$. $\blacksquare$

This  sort  of  result  is  closely  related  to  the  subject  reduction  theorems  that  appear  in  type  theory 
research. 

Let $e$ be a raw term such that $\vdash e : s$ is derivable in PCF++ and suppose $e^{*}$ is a translation of
$e$ into PCF+. Our central question is this: what, if anything, is the relationship between $\mathcal{E}(e)$ and
$\mathcal{E}(e^{*})$? Naturally, we might start by guessing that $\mathcal{E}(e) \simeq \mathcal{E}(e^{*})$ in the sense that when one of them
exists, then so does the other, and the results of evaluation are syntactically identical. However,
it does not take much looking to see that the syntactic identity may fail in some cases. First of
all, if $\mathcal{E}(e)$ is a record, then it may contain some fields which do not appear in the result $\mathcal{E}(e^{*})$
of evaluating the coerced term since the latter evaluation will include coercions which may strip
various fields in the course of the evaluation of $e^{*}$. Moreover, if $\mathcal{E}(e)$ is a lambda term, then $\mathcal{E}(e^{*})$
may contain unexecuted coercions in its body which do not appear in $\mathcal{E}(e)$. Worse yet, it seems that
even two different translations of $\vdash e : s$ may have different canonical forms! Hence we cannot expect a
result as simple as the one just proposed and, indeed, we cannot expect a simple-minded statement
of an operational coherence result.

Nevertheless, there are some obvious counter-observations to
the problems just mentioned. In the case of records, the extra fields which appear in $\mathcal{E}(e)$ may be
“junk fields” which were not mentioned in the type $s$. One might argue that it is not even desirable
that the result of the evaluation should have fields not included in the specified type $s$. Could it
be that $\mathcal{E}(e)$ and $\mathcal{E}(e^{*})$ share “essential” fields in common? Also, the problem with higher types
(lambda-abstractions) misses a central point: the “appearance” of a term at non-observable type is
not important. Since most interpreters do not display any description of a higher-order procedure,
we are interested only in the applicative behavior of such terms in observable contexts. Our goal is
therefore to define what we mean by an observable type and define a notion of essential observable
equivalence for PCF++ judgements at these types.

Definition: Types bool and num are ground types. A type $s$ is observable if

• $s$ is a ground type, or

• $s = \{l_1 : s_1, \ldots, l_n : s_n\}$ where $s_1, \ldots, s_n$ are observable types, or

• $s = [l_1 : s_1, \ldots, l_n : s_n]$ where $s_1, \ldots, s_n$ are observable types. $\blacksquare$

Definition: The relation $=_s$ between canonical forms of PCF++ observable type $s$ is defined
inductively as follows:

• If $s$ is a ground type, then $c =_s c'$ iff $c = c'$.

• Let $s = \{l_1 : s_1, \ldots, l_n : s_n\}$, then

$$
\{l_1 = c_1, \ldots, l_n = c_n, \ldots, l_m = c_m\} =_s \{l_1 = c_1', \ldots, l_n = c_n', \ldots, l_k = c_k'\}
$$

iff $c_i =_{s_i} c_i'$ for $i = 1, \ldots, n$.

• Let $s = [l_1 : s_1, \ldots, l_n : s_n]$, then $[l_i = c_i] =_s [l_i = c_i']$ iff $c_i =_{s_i} c_i'$. $\blacksquare$

If $E$ and $E'$ are expressions that may be undefined, write $E \simeq_s E'$ to mean that if one expression
exists, then so does the other and $E =_s E'$. We may now express the desired result:
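
The two definitions above can be read as a pair of recursive checks. The following Python sketch is one way to write them down, assuming a hypothetical tuple encoding in which records are `("rec", {label: form})` and variants are `("inj", label, form)`; the names `is_observable`, `obs_eq`, and `obs_sim` are illustrative and not taken from the paper. An undefined evaluation result is represented by `None`.

```python
# Hypothetical encoding of types as nested tuples.
NUM = ("num",)
BOOL = ("bool",)

def fun(s, t):
    return ("fun", s, t)

def rec(**fields):
    return ("rec", fields)

def var(**cases):
    return ("var", cases)

def is_observable(s):
    """Ground types, and records or variants built from observable types."""
    if s[0] in ("num", "bool"):
        return True
    if s[0] in ("rec", "var"):
        return all(is_observable(u) for u in s[1].values())
    return False                      # function types are not observable

def obs_eq(s, c1, c2):
    """Equivalence of canonical forms c1 and c2 at the observable type s."""
    if s[0] in ("num", "bool"):
        return c1 == c2               # syntactic identity at ground types
    if s[0] == "rec":
        # Only the fields named in s are compared; extra fields are ignored.
        return all(obs_eq(u, c1[1][l], c2[1][l]) for l, u in s[1].items())
    if s[0] == "var":
        # Same label, and equivalent payloads at that label's type.
        return c1[1] == c2[1] and obs_eq(s[1][c1[1]], c1[2], c2[2])
    raise ValueError("type is not observable")

def obs_sim(s, r1, r2):
    """Both results undefined (None), or both defined and equivalent at s."""
    if r1 is None or r2 is None:
        return r1 is None and r2 is None
    return obs_eq(s, r1, r2)
```

At the type with the single field `x : num`, the canonical records `{x = 0, junk = true}` and `{x = 0}` are related by `obs_eq`, which is the sense in which junk fields are invisible.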


Main Theorem: Suppose $\vdash e : s$ is derivable in PCF++ and $e^{*}$ is any PCF+ term
which translates this sequent, then $e \Downarrow$ iff $e^{*} \Downarrow$. Moreover, if $s$ is observable, then
$\mathcal{E}(e) \simeq_s \mathcal{E}(e^{*})$. $\blacksquare$


It  seems  difficult  to  prove  this  result  directly  because  of  the  recursion  case.  This  problem  is  resolved 
by  appealing  to  denotational  models  for  PCF+  and  PCF++  which  we  now  describe. 

3  A  computationally  adequate  denotational  semantics. 

For technical reasons we have found that it is useful to appeal to some results relating PCF+ and
PCF++ to a specific denotational model which we will describe in this section. Although our goal
is to prove a purely syntactic result (the Main Theorem at the end of the previous section), the
semantic results which we will now establish are of independent interest.

We describe a domain-theoretic model for PCF+. The interpretation of types is as follows:

• $[\![\mathrm{bool}]\!]$ is the flat domain with three distinct elements: $tt$, $ff$, and the least element $\bot$.

• $[\![\mathrm{num}]\!]$ is the flat domain consisting of the numbers $0, 1, 2, \ldots$ together with a least element $\bot$.

• $[\![s \to t]\!] = ([\![s]\!] \mathrel{\circ\!\!\to} [\![t]\!])_{\bot}$, the lifted domain of strict (i.e. $\bot$-preserving) functions
from $[\![s]\!]$ into $[\![t]\!]$.

• $[\![\{l_1 : s_1, \ldots, l_n : s_n\}]\!]$ consists of a bottom element $\bot$, together with the set of tuples
$\{l_1 = d_1, \ldots, l_n = d_n\}$ where each $d_i$ is a non-bottom element of $[\![s_i]\!]$. The ordering is defined by

$$
\{l_1 = d_1, \ldots, l_n = d_n\} \sqsubseteq \{l_1 = d_1', \ldots, l_n = d_n'\}
$$

iff $d_i \sqsubseteq d_i'$ for each $i = 1, \ldots, n$, and $\bot \sqsubseteq d$ for each record $d$.

• $[\![[l_1 : s_1, \ldots, l_n : s_n]]\!]$ consists of a bottom element $\bot$, together with the set of pairs $[l_i = d_i]$
such that $d_i$ is a non-bottom element of $[\![s_i]\!]$. For two such pairs, $[l_i = d] \sqsubseteq [l_j = d']$ iff $i = j$
and $d \sqsubseteq d'$.

Suppose $H = x_1 : s_1, \ldots, x_n : s_n$ is a type context. An $H$-environment is a function $\rho$ which assigns
to each variable $x_i$ an element $\rho(x_i)$ of the domain $[\![s_i]\!]$. The PCF+ interpretation of a sequent
$H \vdash e : s$ is a function which assigns to each $H$-environment $\rho$ a value $[\![H \vdash e : s]\!]^{+}\rho$ in $[\![s]\!]$.

We will refrain from writing out all of the semantic equations for the sequents of PCF+. The
rules for the introduction and elimination operators for the record and variant types are
straightforward, holding in mind that the interpretation of a record with a field which is $\bot$ is itself equal
to $\bot$. Recursion is defined in the usual way using least fixed points. The function space requires
some explanation which we now provide.

The lift $D_{\bot}$ of a domain $D$ is obtained by adding a new bottom element. There is a continuous
function $\mathrm{up} : D \to D_{\bot}$ which sends elements of $D$ to their images in the lifted domain. This function
is not strict, since it sends the bottom of $D$ to an element of $D_{\bot}$ which dominates the “new” bottom
element. There is a unique continuous strict function $\mathrm{down} : D_{\bot} \to D$ such that $(\mathrm{down} \circ \mathrm{up})(x) = x$
for any $x$ and $(\mathrm{up} \circ \mathrm{down})(y) = y$ for any $y \ne \bot$. This equational relationship between the two
functions plays an essential role in the computational adequacy result which we will state later.
Now, the meaning of a derivable typing judgement of the form $H \vdash \lambda x : s.\ e : s \to t$ is given as follows:

$$
[\![H \vdash \lambda x : s.\ e : s \to t]\!]^{+}\rho = \mathrm{up}(\mathrm{strict}(\lambda d \in [\![s]\!].\ [\![H, x : s \vdash e : t]\!]^{+}\rho[d/x]))
$$

where $\rho[d/x]$ is the $H, x : s$ environment which is the same as $\rho$ except it sends $x$ to $d$ (we assume
that $x$ is a “fresh” variable which does not appear in $H$) and the second lambda abstraction is the
“semantic” notation for a function which takes an argument $d \in [\![s]\!]$. Since the interpretation of a
function application to a program with value $\bot$ should have value $\bot$ to model call-by-value properly,
one must apply the function strict defined as

$$
\mathrm{strict}(f)(x) =
\begin{cases}
f(x) & \text{if } x \ne \bot \\
\bot & \text{if } x = \bot
\end{cases}
$$

The resulting strict function is lifted by the function up to ensure that its value is non-bottom.
Again, this will be important later when we prove a correspondence between operational divergence
and having $\bot$ as a meaning. Under our intended operational semantics, no lambda-abstraction is
a divergent program. The definition of application is given as follows:

$$
[\![H \vdash e(e') : t]\!]^{+}\rho = \mathrm{down}([\![H \vdash e : s \to t]\!]^{+}\rho)([\![H \vdash e' : s]\!]^{+}\rho)
$$
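
The interplay of `up`, `down`, and `strict` in the function space can be illustrated with a small Python sketch. It is only a finite caricature of the domain-theoretic construction: bottom is modelled by a sentinel object, a lifted function by a wrapper object, and `down` is specialised to function spaces, where the bottom of the underlying domain is the constantly-bottom function. All names here (`BOTTOM`, `Up`, `lam`, `apply_fn`) are assumptions made for illustration.

```python
class _Bottom:
    """Sentinel for the least element; it also models a divergent program."""
    def __repr__(self):
        return "BOTTOM"

BOTTOM = _Bottom()

def strict(f):
    """Make f bottom-preserving, as call-by-value application requires."""
    return lambda x: BOTTOM if x is BOTTOM else f(x)

class Up:
    """An element up(f) of the lifted function space; never equal to BOTTOM."""
    def __init__(self, f):
        self.f = f

def up(f):
    return Up(f)

def down(y):
    """Strict inverse of up: the new bottom goes to the constantly-bottom map."""
    if y is BOTTOM:
        return lambda d: BOTTOM
    return y.f

def lam(body):
    """Meaning of a lambda abstraction: a lifted strict function."""
    return up(strict(body))

def apply_fn(f, d):
    """Meaning of an application: unlift the function, then apply it."""
    return down(f)(d)
```

In this sketch `lam(lambda n: BOTTOM)` is a non-bottom value even though every application of it yields `BOTTOM`, which matches the remark that no lambda-abstraction is a divergent program.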

We may now show how our model for PCF+ can be used to construct a model for PCF++.
Following ideas from [BCGS89] we use the following Semantic Coherence Theorem due to Rick
Blute:

Theorem 3 (Semantic Coherence) If $\Gamma$ and $\Delta$ are PCF++ derivations of a sequent $H \vdash e : s$,
then $[\![H \vdash T^{*}(\Gamma) : s]\!]^{+} = [\![H \vdash T^{*}(\Delta) : s]\!]^{+}$. $\blacksquare$

A similar result was a central objective of the work in [BCGS89] where the coherence is proved for
a class of models of an equational theory. The model here differs from the ones considered there
since the unrestricted $\beta$ rule does not hold in the model we have described in the current paper.
The semantic function for sequents of PCF++ is now defined as follows:
$[\![H \vdash e : s]\!]^{++}\rho = [\![H \vdash e^{*} : s]\!]^{+}\rho$ where $H \vdash e^{*} : s$ is any translation of $H \vdash e : s$. If we note
that any PCF+ derivation is a PCF++ derivation, then we get the following corollary:

Corollary 4 If $H \vdash e : s$ is a derivable judgment of PCF+, then $[\![H \vdash e : s]\!]^{+}\rho = [\![H \vdash e : s]\!]^{++}\rho$
for any $H$-environment $\rho$. $\blacksquare$

As we shall see later, this corollary permits us to transfer some hard-earned results about PCF++
to results about PCF+. In light of the corollary, we may sometimes omit the tags on the semantic
brackets for PCF+ derivable typing judgements.

We now wish to show that the semantics for PCF++ which we have just defined is closely
related to its operational semantics. Here is our first crucial relationship:

Theorem 5 (Soundness) If $\vdash e : s$ is derivable in PCF++, and $e \Downarrow c$, then $[\![e : s]\!]^{++} = [\![c : s]\!]^{++}$. $\blacksquare$

We have omitted the proof, which is straightforward but tedious. We mention only the following
facts which are needed:

Lemma 6

1. If $r \le s \le t$, then $[\![\mathrm{coerce}[r \le t] : r \to t]\!] = [\![\mathrm{coerce}[s \le t] : s \to t]\!] \circ [\![\mathrm{coerce}[r \le s] : r \to s]\!]$.

2. If $s \le t$, then $[\![\mathrm{coerce}[s \le t] : s \to t]\!](d) = \bot$ iff $d = \bot$. $\blacksquare$

Lemma 7 If $\vdash c : s$ is a derivable judgement of PCF++ and $c$ is a canonical form, then
$[\![c : s]\!]^{++} \ne \bot$. $\blacksquare$

Most of the rest of this section is devoted to a proof of a kind of converse to the Soundness
Theorem which we will call computational adequacy (the term is suggested by Albert Meyer [Mey88],
although his definition includes soundness). For PCF++, it can be stated as follows:

Theorem 8 (Computational Adequacy) Suppose $\vdash e : s$ is derivable in PCF++. If $[\![e : s]\!]^{++} \ne \bot$
then $e \Downarrow c$ for some canonical form $c$.

We focus on explaining how the methods that one uses for results such as those above are
applied to a calculus with multiple inheritance. We will look at the proof of adequacy in some
detail. The proof requires a relation between program meanings and programs sometimes called
an “inclusive predicate”. We define this relationship as follows:

Definition: Define a relation $\lesssim_s$ between elements of $[\![s]\!]$ on the left and closed raw terms of type
$s$ on the right as follows: $d \lesssim_s e$ if $d = \bot$, or $e \Downarrow c$ for some $c$ and $d \lesssim_s c$, where

• $f \lesssim_{s \to t} \lambda x : r.\ e$ iff for each $d \in [\![s]\!]$ and term $c$, $d \lesssim_s c$ implies $\mathrm{down}(f)(d) \lesssim_t [c/x]e$.

• $\{l_1 = d_1, \ldots, l_n = d_n\} \lesssim_{\{l_1 : s_1, \ldots, l_n : s_n\}} \{l_1 = c_1, \ldots, l_m = c_m\}$ iff $m \ge n$ and
$d_i \lesssim_{s_i} c_i$ for $i = 1, \ldots, n$.

• $[l_i = d] \lesssim_{[l_1 : s_1, \ldots, l_n : s_n]} [l_j = c]$ iff $i = j$ and $d \lesssim_{s_i} c$.

• $tt \lesssim_{\mathrm{bool}} \mathrm{true}$ and $ff \lesssim_{\mathrm{bool}} \mathrm{false}$.

• $0 \lesssim_{\mathrm{num}} 0$ and if $n \lesssim_{\mathrm{num}} c$ for a number $n$, then $n + 1 \lesssim_{\mathrm{num}} \mathrm{Succ}(c)$. $\blacksquare$

Some of the essential semantic properties of $\lesssim$ are given in the following:

Lemma 9

1. If $a \sqsubseteq b \lesssim_s e$, then $a \lesssim_s e$.

2. If $a_0 \sqsubseteq a_1 \sqsubseteq a_2 \sqsubseteq \cdots$ is an ascending chain and $a_n \lesssim_s e$ for each $n$, then
$\bigsqcup_{n \in \omega} a_n \lesssim_s e$. $\blacksquare$

We are now ready to sketch the proof of the primary technical lemma which is needed for the proof
of PCF++ adequacy.

Lemma 10 Suppose $H = x_1 : s_1, \ldots, x_k : s_k$ and $H \vdash e' : s'$ is derivable. If $d_i \in [\![s_i]\!]$ and
$d_i \lesssim_{s_i} e_i$ for $i = 1, \ldots, k$, then
$[\![H \vdash e' : s']\!]^{++}[d_1, \ldots, d_k / x_1, \ldots, x_k] \lesssim_{s'} [e_1, \ldots, e_k / x_1, \ldots, x_k]e'$.

Proof: Let $\rho$ be the environment $[d_1, \ldots, d_k / x_1, \ldots, x_k]$ and $\sigma$ be the substitution
$[e_1, \ldots, e_k / x_1, \ldots, x_k]$. Let $\Delta$ be a PCF++ derivation of the typing judgement $H \vdash e' : s'$. We
prove that $[\![H \vdash e' : s']\!]^{++}\rho \lesssim_{s'} \sigma e'$ by an induction on $\Delta$. Assume that the lemma is known
for proofs of lesser height. There are eleven possibilities for the last step of $\Delta$. Some of the more
interesting cases (subsumption in particular) are written out fully below.



• Base case: $H \vdash x_i : s_i$.

Suppose the sequent is an axiom of the form above (i.e. $e' = x_i$). Then we have
$[\![H \vdash x_i : s_i]\!]^{++}\rho = d_i \lesssim_{s_i} e_i = \sigma x_i$ by assumption.

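• Lambda abstraction:

$$
\frac{H, x : s \vdash e : t}{H \vdash \lambda x : s.\ e : s \to t}
$$

Suppose the last inference in $\Delta$ has the form above (in particular, $e' : s' = \lambda x : s.\ e : s \to t$).
Let $\Delta'$ be the part of the proof $\Delta$ which proves $H, x : s \vdash e : t$ and suppose $H, x : s \vdash e^{*} : t$ is
$T^{*}\Delta'$. Let $f = [\![H \vdash \lambda x : s.\ e : s \to t]\!]^{++}\rho = [\![H \vdash \lambda x : s.\ e^{*} : s \to t]\!]^{+}\rho$ and suppose $d \lesssim_s c$.
We must show that

$$
d' = \mathrm{down}(f)(d) \lesssim_t (\sigma\lambda x : s.\ e)(c) \qquad (1)
$$

If $d' = \bot$ then there is no problem. Suppose $d' \ne \bot$, then $\mathrm{down}(f) \ne \bot$, so

$$
\begin{array}{rcl}
d' & = & \mathrm{down}(\mathrm{up}(\mathrm{strict}(\lambda d'' \in [\![s]\!].\ [\![H, x : s \vdash e^{*} : t]\!]^{+}\rho[d''/x])))(d) \\
   & = & (\mathrm{strict}(\lambda d'' \in [\![s]\!].\ [\![H, x : s \vdash e^{*} : t]\!]^{+}\rho[d''/x]))(d)
\end{array}
$$

and there are two cases. If $d = \bot$, then $d' = \bot \lesssim_t \sigma[c/x]e$ as desired. However, if $d \ne \bot$, then

$$
\begin{array}{rcl}
d' & = & [\![H, x : s \vdash e^{*} : t]\!]^{+}\rho[d/x] \\
   & = & [\![H, x : s \vdash e : t]\!]^{++}\rho[d/x] \\
   & \lesssim_t & \sigma[c/x]e
\end{array}
$$

by the induction hypothesis. Since $d' \ne \bot$, there is a canonical $c'$ such that $\sigma[c/x]e \Downarrow c'$ and
$d' \lesssim_t c'$. Since $[c/x]\sigma e = \sigma[c/x]e$, we have $(\lambda x : s.\ \sigma e)(c) = (\sigma\lambda x : s.\ e)(c)$ so it must be the
case that $(\sigma\lambda x : s.\ e)(c) \Downarrow c'$ too, so (1) holds.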


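• Application:

$$
\frac{H \vdash e_1 : s \to t \qquad H \vdash e_2 : s}{H \vdash e_1(e_2) : t}
$$

Let $H \vdash e_1^{*} : s \to t$ and $H \vdash e_2^{*} : s$ be translations dictated by $\Delta$. If
$d' = [\![H \vdash e_1(e_2) : t]\!]^{++}\rho = [\![H \vdash e_1^{*}(e_2^{*}) : t]\!]^{+}\rho \ne \bot$, then
$f = [\![H \vdash e_1 : s \to t]\!]^{++}\rho = [\![H \vdash e_1^{*} : s \to t]\!]^{+}\rho \ne \bot$
and $d = [\![H \vdash e_2 : s]\!]^{++}\rho = [\![H \vdash e_2^{*} : s]\!]^{+}\rho \ne \bot$ (using the fact that all our functions are
strict!). By the induction hypothesis, $f \lesssim_{s \to t} \sigma e_1$ and $d \lesssim_s \sigma e_2$, so there is a term
$e_3$ and a canonical form $c$ such that

$$
\begin{array}{l}
\sigma e_1 \Downarrow \lambda x : s.\ e_3 \ \text{and}\ f \lesssim_{s \to t} \lambda x : s.\ e_3 \\
\sigma e_2 \Downarrow c \ \text{and}\ d \lesssim_s c
\end{array}
$$

Now $d' = \mathrm{down}(f)(d) \lesssim_t [c/x]e_3$ by the definition of $\lesssim_{s \to t}$ so there is a canonical $c'$ such that
$[c/x]e_3 \Downarrow c'$ and $d' \lesssim_t c'$. But $[c/x]e_3 \Downarrow c'$ means $(\sigma e_1)(\sigma e_2) \Downarrow c'$. Since
$(\sigma e_1)(\sigma e_2) = \sigma(e_1(e_2))$, we have $d' \lesssim_t \sigma(e_1(e_2))$ as desired.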

• Recursion:

$$
\frac{H, x : s \vdash e : s}{H \vdash \mu x : s.\ e : s}
$$

Let $H, x : s \vdash e^{*} : s$ be the translation dictated by $\Delta$. Let $d_0 = \bot$ and
$d_{i+1} = [\![H, x : s \vdash e : s]\!]^{++}\rho[d_i/x] = [\![H, x : s \vdash e^{*} : s]\!]^{+}\rho[d_i/x]$. We show that
$d_i \lesssim_s \sigma\mu x : s.\ e$ for each $i$. This is immediate for $d_0 = \bot$. Suppose $d_i \lesssim_s \sigma\mu x : s.\ e$. By
induction hypothesis, $d_{i+1} = [\![H, x : s \vdash e : s]\!]^{++}\rho[d_i/x] \lesssim_s \sigma[\mu x : s.\ e/x]e$. If $d_{i+1} \ne \bot$, then
$\sigma[\mu x : s.\ e/x]e \Downarrow c$ for some $c$ such that $d_{i+1} \lesssim_s c$. Now
$\sigma[\mu x : s.\ e/x]e = [\sigma\mu x : s.\ e/x]\sigma e = [\mu x : s.\ \sigma e/x]\sigma e$. Hence
$\sigma\mu x : s.\ e = \mu x : s.\ \sigma e \Downarrow c$ as well. Since
$[\![H \vdash \mu x : s.\ e : s]\!]^{++}\rho = [\![H \vdash \mu x : s.\ e^{*} : s]\!]^{+}\rho = \bigsqcup_{i \in \omega} d_i$, the desired result follows.

• Subsumption rule:

$$
\frac{H \vdash e : s \qquad s \le t}{H \vdash e : t}
$$

The proof for this case is by induction on the height of the proof that $s \le t$. Assume that
we know that the theorem holds for $H \vdash e : s$ and let $H \vdash e^{*} : s$ be any translation of this
sequent to PCF+. There are four subcases:

- Base types: These are both obvious since the coercion is the identity map.

- Functions:

$$
\frac{u' \le u \qquad v \le v'}{u \to v \le u' \to v'}
$$

Suppose $s = u \to v$ and $t = u' \to v'$. Let $\xi_1 = \mathrm{down}[\![\mathrm{coerce}[v \le v']]\!]$ and
$\xi_2 = \mathrm{down}[\![\mathrm{coerce}[u' \le u]]\!]$. Then $\xi = \mathrm{down}[\![\mathrm{coerce}[u \to v \le u' \to v']]\!]$ satisfies
$\xi(f) = \xi_1 \circ f \circ \xi_2$ for $f : [\![u]\!] \mathrel{\circ\!\!\to} [\![v]\!]$. Set $f = \mathrm{down}[\![H \vdash e : s]\!]^{++}\rho$. If $d \lesssim_{u'} c$, then
$\xi_2(d) \lesssim_u c$ by induction hypothesis on $u' \le u$. Thus $f(\xi_2(d)) \lesssim_v (\sigma e)(c)$ by induction hypothesis
on $H \vdash e : s$. We may now apply the induction hypothesis on $v \le v'$ to conclude
that $\xi(f)(d) = \xi_1(f(\xi_2(d))) \lesssim_{v'} (\sigma e)(c)$. Since $\xi(f) = [\![H \vdash e : t]\!]^{++}\rho$ we conclude that
$[\![H \vdash e : t]\!]^{++}\rho \lesssim_t \sigma e$.

- Records:

$$
\frac{s_1 \le t_1 \quad \cdots \quad s_n \le t_n}
     {\{l_1 : s_1, \ldots, l_n : s_n, \ldots, l_m : s_m\} \le \{l_1 : t_1, \ldots, l_n : t_n\}}
$$

Let $\xi_i = \mathrm{down}[\![\mathrm{coerce}[s_i \le t_i]]\!]$ for $i = 1, \ldots, n$ and let $\xi = \mathrm{down}[\![\mathrm{coerce}[s \le t]]\!]$. By
induction hypothesis, we have $d = [\![H \vdash e : s]\!]^{++}\rho \lesssim_s \sigma e$. If $d = \bot$, then
$\xi(d) = [\![H \vdash e : t]\!]^{++}\rho = \bot$ and we are done. If $d \ne \bot$, then $d = \{l_1 = d_1, \ldots, l_m = d_m\}$ where
$d_1, \ldots, d_m \ne \bot$ and $\sigma e \Downarrow c$ for some canonical $c$ of the form $c = \{l_1 = c_1, \ldots, l_j = c_j\}$
such that $j \ge m$ and $d_i \lesssim_{s_i} c_i$ for $i = 1, \ldots, m$. By the induction hypothesis on
inheritance judgements, we must therefore have $\xi_i(d_i) \lesssim_{t_i} c_i$ for each $i = 1, \ldots, n$. Hence
$\xi(d) = \{l_1 = \xi_1(d_1), \ldots, l_n = \xi_n(d_n)\} \lesssim_t \{l_1 = c_1, \ldots, l_j = c_j\}$ by the definition of $\lesssim_t$
and we are done.

- Variants:

$$
\frac{s_1 \le t_1 \quad \cdots \quad s_n \le t_n}
     {[l_1 : s_1, \ldots, l_n : s_n] \le [l_1 : t_1, \ldots, l_n : t_n, \ldots, l_m : t_m]}
$$

Let $\xi_i = \mathrm{down}[\![\mathrm{coerce}[s_i \le t_i]]\!]$ for $i = 1, \ldots, n$ and let $\xi = \mathrm{down}[\![\mathrm{coerce}[s \le t]]\!]$. By
induction hypothesis, we have $d = [\![H \vdash e : s]\!]^{++}\rho \lesssim_s \sigma e$. If $d = \bot$, then
$\xi(d) = [\![H \vdash e : t]\!]^{++}\rho = \bot$ and we are done. If $d \ne \bot$, then $d = [l_i = d_i]$ where $d_i \ne \bot$
and $\sigma e \Downarrow c$ where $d \lesssim_s c$. By the definition of $\lesssim_s$, the term $c$ has the form
$[l_i = c_i]$ and $d_i \lesssim_{s_i} c_i$. By induction hypothesis on $s_i \le t_i$, we know that $\xi_i(d_i) \lesssim_{t_i} c_i$ so
$\xi(d) = [l_i = \xi_i(d_i)] \lesssim_t [l_i = c_i]$. $\blacksquare$

We may now express the desired proof of Computational Adequacy for PCF++.

Proof: (of Theorem 8) By Lemma 10 we know that $[\![e : s]\!]^{++} \lesssim_s e$. Since the value on the left is
assumed to differ from $\bot$, the Theorem follows immediately from the definition of $\lesssim_s$. $\blacksquare$

The following theorem follows immediately from Soundness and Computational Adequacy for
PCF++ together with Corollary 4 of the Semantic Coherence Theorem for PCF++.

Theorem 11 (Soundness and Adequacy for PCF+) If $\vdash e : s$ is derivable in PCF+, then

1. (Soundness) $e \Downarrow c$ implies $[\![e : s]\!]^{+} = [\![c : s]\!]^{+}$.

2. (Computational Adequacy) $[\![e : s]\!]^{+} \ne \bot$ implies $e \Downarrow c$ for some canonical form $c$. $\blacksquare$

The following lemma is needed for the proof of the Main Theorem:

Lemma 12 Let $c$ and $c'$ be canonical forms such that $\vdash c : s$ and $\vdash c' : s$ are derivable in PCF++
for an observable type $s$. Then $[\![c : s]\!]^{++} = [\![c' : s]\!]^{++}$ iff $c =_s c'$. $\blacksquare$

Corollary 13 Let $e$ and $e'$ be raw terms such that $\vdash e : s$ and $\vdash e' : s$ are derivable for an observable
type $s$. Then $[\![e : s]\!]^{++} = [\![e' : s]\!]^{++}$ iff $\mathcal{E}(e) \simeq_s \mathcal{E}(e')$.

Proof: This follows from Adequacy, Soundness and Lemma 12. $\blacksquare$


Main Theorem: Suppose $\vdash e : s$ is derivable in PCF++ and $e^{*}$ is any PCF+ term which translates
this sequent, then $e \Downarrow$ iff $e^{*} \Downarrow$. Moreover, if $s$ is observable, then $\mathcal{E}(e) \simeq_s \mathcal{E}(e^{*})$. $\blacksquare$

Proof: Suppose $e \Downarrow c$. Then $[\![e : s]\!]^{++} = [\![c : s]\!]^{++} \ne \bot$ by the Soundness Theorem for PCF++
and Lemma 7. Since $[\![e : s]\!]^{++} = [\![e^{*} : s]\!]^{+}$, we may conclude from Soundness and Adequacy for
PCF+ that there is a canonical form $c'$ such that $e^{*} \Downarrow c'$ and $[\![c' : s]\!]^{++} = [\![c : s]\!]^{++}$. If $s$ is an
observable type then $c =_s c'$ by Lemma 12.

Suppose conversely that $e^{*} \Downarrow c'$ for some canonical $c'$. By the Soundness for PCF+,
$[\![e^{*} : s]\!]^{+} = [\![c' : s]\!]^{+} \ne \bot$. Hence $[\![e : s]\!]^{++} = [\![e^{*} : s]\!]^{+} \ne \bot$. By Adequacy and Soundness for
PCF++, there is a canonical form $c$ such that $e \Downarrow c$ and $[\![c : s]\!]^{++} = [\![c' : s]\!]^{++}$. Thus $c =_s c'$ by
Lemma 12. $\blacksquare$

4  Conclusions  and  directions  for  further  research. 

We have shown that the inheritance-interpreted-as-definable-coercion semantic paradigm behaves well
with  respect  to  operational  semantics.  More  specifically,  we  have  shown  that  the  coercion  terms 
that  we  introduce  in  this  interpretation,  while  possibly  generating  more  computation,  will  only 
generate  “harmless”  computation,  in  particular  that  no  unexpected  divergence  can  be  introduced, 
nor  can  expected  divergence  be  lost.  (In  the  process,  we  actually  exhibited  a  nice  domain-theoretic 
model  which  is  sound  and  computationally  adequate  for  PCF++’s  straightforward  operational 
semantics.) 

There  are  at  least  two  points  where  we  can  see  improvements  to  our  results.  One  problem 
is  that  we  would  like  to  strengthen  the  main  theorem  so  as  to  say  something  interesting  about 
the relationship between $\mathcal{E}(e)$ and $\mathcal{E}(e^*)$ when their type is not necessarily observable. The other
problem  is  that  the  proof  of  Theorem  5  does  not  really  use  the  particularities  of  the  denotational 
semantics  but  rather  the  fact  that  certain  identities  between  PCF+  terms  hold  in  it.  These  two 
points  are  related  and  here  is  a  conjectured  improvement  which  would  solve  both  problems. 

Suppose $\vdash e : s$ type-checks in PCF++ and let $e^*$ be any PCF+ translation of it. Further
suppose that $e \Downarrow c$ for some canonical form $c$. By our main theorem, there is a canonical form $c'$
such that $e^* \Downarrow c'$. We would like to relate $c$ and $c'$ as PCF+ terms, but $\vdash c : s$ may type-check in
PCF++ only. So, let $c^*$ be any PCF+ translation of it. What do we know about the relationship
between  c*  and  c'?  It  is  a  consequence  of  the  soundness  results  that  the  model  we  introduce  in 
section 3 equates them. But equality in this model is $\Pi^0_2$-hard. Surely the relationship between c*
and  c'  is  much  simpler. . . 

We believe that it is possible to formulate a reasonable logical theory about PCF+ terms, call
it $T$, in which $c^*$ and $c'$ can be shown to be provably equal. In fact, we believe that such a theory
would be closely related to the call-by-value lambda-calculi studied by Plotkin and Moggi [Mog88].
This result would have the following pleasant corollaries. Let $\mathcal{D}^{+}$ be any denotational model of
PCF+ in which the operational semantics and the axiomatization of $T$ are sound (actually, we
expect that the soundness of the latter will imply that of the former). One immediately concludes
that our translation is denotationally coherent with respect to $\mathcal{D}^{+}$, which induces a model $\mathcal{D}^{++}$
of PCF++, and that the operational semantics of PCF++ terms is sound in $\mathcal{D}^{++}$. Of course, by
the main theorem of this paper, we can also get transfer of computational adequacy. Therefore,
we would be able to neatly concentrate in the axiomatization of $T$ all the conditions needed by a
“good” model of PCF+ in order to become a model of PCF++ in accordance with our paradigm.

An intriguing question is whether $c^* = c'$ will turn out to be more than an r.e. statement:
is it actually decidable? In other words, is full PCF+ computation required in order to
systematically disentangle the coercions we introduce?

Finally,  we  should  restate  that  we  expect  that  the  results  of  this  paper  generalize  to  more 
complicated type disciplines (Fun, Quest, etc.) and that analogs can be shown for call-by-name
operational semantics.

Abstract

This  report  is  intended  to  describe  and  motivate  a  relationship  between  a  class  of  nets  and 
the  fragment  of  linear  logic  built  from  the  tensor  connective.  In  this  fragment  of  linear  logic  a  net 
may  be  represented  as  a  theory  and  a  computation  on  a  net  as  a  proof.  A  rigorous  translation 
is described and a soundness and completeness theorem is stated. The translation suggests
connections  between  concepts  from  concurrency  such  as  causal  dependency  and  concepts  from 
proof  theory  such  as  cut  elimination.  The  main  result  of  this  report  is  a  “cut  reduction”  theorem 
which  establishes  that  any  proof  of  a  sequent  can  be  transformed  into  another  proof  of  the  same 
sequent with the property that all cuts are “essential”. A net-theoretic reading of this result tells
us that unnecessary dependencies can be eliminated from a computation, resulting in a maximally
concurrent  computation.  We  note  that  it  is  possible  to  interpret  proofs  as  arrows  in  the  strictly 
symmetric  strict  monoidal  category  freely  generated  by  a  net  and  establish  soundness  of  our 
proof  reduction  rules  under  this  interpretation.  Finally,  we  discuss  how  other  linear  connectives 
may  be  related  to  the  concepts  of  internal  and  external  choice. 


1  Introduction 

In  this  paper  we  explore  the  idea  of  describing  the  operational  semantics  of  a  net  (the  so-called 
“token  game”)  in  proof-theoretic  terms.  Under  our  approach,  a  net  will  correspond  to  a  logical 
theory,  and  the  token  games  on  the  net  will  be  represented  as  proof  trees  in  the  “logic”  of  the  net. 
This  correspondence  reveals  an  interesting  relationship  between  concepts  of  proof  theory  (such  as 
cut elimination) and fundamental concepts in concurrency (such as causal dependency) as they are
illustrated by net theory. Our proof-theoretic representation works for a certain class of nets in
which events are uniquely determined by their pre and post conditions. Such nets are represented
as sets of sequents in a fragment of linear logic based on the tensor connective.

*This is an extended and revised version of the preliminary report that appeared in: Application and Theory
of Petri Nets, edited by G. De Michelis, June 1989, pp. 174-191.

†Research of both authors is supported by Office of Naval Research Grant N00014-88-K-0557. Electronic mail
addresses for the authors are gunter@linc.cis.upenn.edu and gehlot@linc.cis.upenn.edu

Linear logic is a system introduced by J.-Y. Girard based on the inspiration of his work on a class
of mathematical domains called coherence spaces [7, 8]. One way of understanding propositional
linear logic is to see it as a modification of propositional logic which takes seriously the concept of
a resource. As such it is related to such systems as relevance logic which incorporate this concept
as  well  (see  [3]  for  a  full  discussion).  Resources  are  also  a  familiar  aspect  of  the  theory  of  Petri 
nets.  In  what  follows,  we  will  attempt  to  convince  the  reader  that  the  senses  in  which  linear  logic 
and  Petri  nets  deal  with  resources  have  many  things  in  common.  Indeed,  we  will  demonstrate  a 
translation  which  characterizes  the  relationship  exactly. 

However,  the  way  linear  logic  and  nets  represent  resources  is  only  a  part  of  what  we  feel  is  a 
much  more  important  common  characteristic  of  the  two  theories:  the  way  in  which  they  illustrate 
true concurrency. It is well known that nets provide an intuitive and pictorial way of seeing many
fundamental  ideas  of  concurrent  computation.  In  what  follows,  we  will  show  how  this  intuition 
may  also  be  seen  in  the  theories  and  proof  trees  of  (a  fragment  of)  linear  logic. 

Other  researchers  have  independently  looked  at  the  relationship  between  Petri  nets  and  linear 
logic.  The  work  of  Asperti  [1,  2]  follows  much  the  same  basic  intuition  that  we  discuss  below 
for  the  tensor  connective.  Carolyn  Brown  at  the  University  of  Edinburgh  has  proven  a  result 
similar to our soundness and completeness theorem and studied a fragment of linear logic formulae
with additional connectives [4]. Narciso Martí-Oliet and José Meseguer [11] have discussed the
relationship between Petri nets and linear logic from the point of view of category theory. We
would like to acknowledge the assistance of Jean-Yves Girard, who provided much of the inspiration
for  this  investigation.  We  also  thank  Dexter  Kozen,  Prakash  Panangaden,  and  Andre  Scedrov  for 
ideas  and  encouragement  and  acknowledge  helpful  discussions  with  Eike  Best,  Ursula  Goltz,  Ugo 
Montanari,  and  Wolfgang  Reisig. 

Throughout  the  rest  of  the  paper  we  will  assume  some  familiarity  with  net  theory  and  proof 
theory. Concepts and notations related to the former can be found in [15] and [6]. For the latter, [18]
and  [17]  are  excellent  references. 

2  Relating  Nets  and  Theories 

In  this  section  we  outline  the  fragment  of  linear  logic  on  which  this  paper  will  be  concentrating. 
The theory will be given in the form of a Gentzen style sequent calculus.

A tensor formula is either a propositional atom or the tensor product $A \otimes B$ of tensor formulas
$A$ and $B$. A tensor sequent is a pair $\Gamma \vdash A$ where $\Gamma$ is a list of tensor formulae. A tensor theory
is a set of tensor sequents. Of course, any set of sequents $T$ will generate a tensor theory $\mathrm{Th}(T)$
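which is the least set of sequents containing $T$ and closed under the rules in Figure 1.

Structural Rules

\[
\dfrac{\Gamma, A, B, \Delta \vdash C}{\Gamma, B, A, \Delta \vdash C}\ (\text{Exchange})
\qquad
\dfrac{}{A \vdash A}\ (\text{Identity})
\qquad
\dfrac{\Gamma \vdash A \qquad A, \Delta \vdash B}{\Gamma, \Delta \vdash B}\ (\text{Cut})
\]

Logical Rules

\[
\dfrac{\Gamma \vdash A \qquad \Delta \vdash B}{\Gamma, \Delta \vdash A \otimes B}\ (\otimes\mathrm{R})
\qquad
\dfrac{\Gamma, A, B \vdash C}{\Gamma, A \otimes B \vdash C}\ (\otimes\mathrm{L})
\]

Figure 1: Structural and logical rules for a fragment of linear logic.

We say that $\Gamma \vdash A$ is provable in $T$ if $\Gamma \vdash A$ is in $\mathrm{Th}(T)$. We say that $\Gamma \vdash A$ is provable if it is in $\mathrm{Th}(\emptyset)$. Let
us say that a pair $A \dashv\vdash B$ is provable if $A \vdash B$ and $B \vdash A$ are both provable. It is not hard to see
from these axioms that the tensor connective is associative and commutative: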

Proposition 1 For any $A, B, C$, the sequents $A \otimes B \dashv\vdash B \otimes A$ and $(A \otimes B) \otimes C \dashv\vdash A \otimes (B \otimes C)$
are provable. ∎

However, the tensor connective is not absorptive; for example, the sequent $A \otimes A \vdash A$ is not provable.
It is therefore possible to think of a tensor formula as a multiset (or “bag”) of propositional atoms.
Given a tensor formula $A$, let $\mathcal{M}(A)$ be the multi-set of propositional atoms determined by $A$. It
follows from the proposition that tensor formulae $A$ and $B$ such that $\mathcal{M}(A) = \mathcal{M}(B)$ are equivalent,
i.e. $A \dashv\vdash B$. Moreover, sequents $\Gamma \vdash A$ and $\Delta \vdash A$ are equivalent in the sense that each can be
derived from the other if the lists $\Gamma$ and $\Delta$ determine the same multi-set of propositions. For this
reason, we will treat sequents as pairs $\Gamma \vdash A$ where $\Gamma$ is a multi-set.
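The multiset reading of tensor formulas can be illustrated with a small sketch. It assumes, purely for illustration, that a tensor formula is encoded in Python as either a string (an atom) or a pair `(left, right)` standing for the tensor product of the two components; the function names `atoms` and `same_multiset` are hypothetical and do not come from the paper.

```python
from collections import Counter

# Assumed encoding: an atom is a str; a tensor product of A and B is the pair (A, B).

def atoms(formula):
    """Return the multiset of propositional atoms of a formula, as a Counter."""
    if isinstance(formula, str):
        return Counter([formula])
    left, right = formula
    return atoms(left) + atoms(right)

def same_multiset(a, b):
    """True when both formulas determine the same multiset of atoms."""
    return atoms(a) == atoms(b)

lhs = (("A", "B"), "C")   # (A tensor B) tensor C
rhs = ("C", ("B", "A"))   # C tensor (B tensor A)
print(same_multiset(lhs, rhs))         # rearranged and re-bracketed: same multiset
print(same_multiset(("A", "A"), "A"))  # A tensor A versus A: multisets differ
```

The first comparison mirrors the associativity and commutativity of Proposition 1, and the second mirrors the failure of absorption noted above.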

For the purposes of this paper, a net $N$ is a set $S_N$ of places together with a set $T_N$ of pairs of
multi-sets over $S_N$. A pair $t = ({}^{\bullet}t, t^{\bullet}) \in N$ is called a transition of the net with pre-condition ${}^{\bullet}t$ and
post-condition $t^{\bullet}$. Of course, this is only one of the many flavors of nets that have been studied in
the rich literature on such structures. Nets, as defined here, are similar to place/transition-systems
as defined, for example, in [15]. However, our notion of net has less structure since there are no
capacities and a transition is uniquely determined by its pre and post conditions. Moreover, a net
in our sense does not have a specified initial marking. One of the appealing characteristics of nets
is the way they lend themselves to pictorial representation. For example, the net $N_0$ consisting of
the pairs $(\{A\}, \{B, B, C\})$ and $(\{B\}, \{A\})$ is pictured as a labelled graph in Figure 2.

Carl Gunter and Vijay Gehlot

Figure 2: Net $N_0$.

Figure 3: A net $N_1$ with concurrency and choice.

Before we offer a technical definition of just how a net determines a theory, we will attempt to
motivate the basic idea by means of examples. Consider the net $N_1$ pictured in Figure 3. In this
net, if we are given a token on the condition $A$, then it is possible to fire the event $r$. Firing this
event exhausts the token on $A$ but provides a token on $B$. Logically, let us read the event $r$ as an
axiom $A \vdash B$ meaning “from $A$ it is possible to obtain $B$.” Similar ideas apply to the events $s$, $t$
and $u$ which we may read as $B \vdash D$ and $A \vdash C$ and $C \vdash E$ respectively. Now, event $v$ requires a
token on $D$ and a token on $E$ in order to fire and produce a token on $F$. We might therefore take
$D, E \vdash F$ as the logical content of $v$. In summary, let $T_1$ be the set of axioms

\[
\begin{array}{ll}
A \vdash B & B \vdash D \\
\multicolumn{2}{c}{D, E \vdash F} \\
A \vdash C & C \vdash E
\end{array}
\]

Do these axioms somehow characterize the net “logically”? If one interprets the comma between
the $D$ and $E$ in the way that one ordinarily does in logic, this tempts one to think of $D, E \vdash F$
as $D \wedge E \vdash F$. But something is now wrong with the proposed “logical interpretation” of the net.
In particular, it is easy to check that $A \vdash F$ is provable from the axioms $T_1$. However, if one’s
interpretation of $A \vdash F$ is “from the resource $A$ one is able to obtain the resource $F$,” then the
deduction is evidently incorrect. The problem lies in the fact that ordinary propositional logic does
not support properly a concept of “proof resource.” The culprit (in this case) is the rule from
first-order logic which gives us:

\[
\dfrac{A \vdash D \qquad A \vdash E}{A \vdash D \wedge E}
\]

This rule clearly does not reflect the desired intuition about resources. If I can use \$1 to buy a
pepsi and \$1 to buy a coke, then I can’t expect to use \$1 to buy both a pepsi and a coke. Of course,
one can also write the conjunction rule as

\[
\dfrac{A \vdash D \qquad A \vdash E}{A, A \vdash D \wedge E}
\]

but this only begs the issue, since some instance of the thinning rule:

\[
\dfrac{\Gamma, X, X \vdash Y}{\Gamma, X \vdash Y}
\]

would be used at a later step in the proof to remove the second copy of $A$ and this rule is just as
suspect as the earlier version of the conjunction rule. To deal with this problem, one needs a logic
in which the thinning rule is omitted and the second of the conjunction rules is used for the “and”
connective that we have in mind.

Figure 4: A net $N_2$ with a critical region.

The proper rules are those given in Figure 1 for the linear logic tensor connective $\otimes$. These
rules keep track of the resources as needed. In linear logic, the sequent $A \vdash F$ is not provable in $T_1$.
However, it is possible to check that $A, A \vdash F$ is provable in $T_1$, as we expect it should be. There
are, in fact, several proofs of $A, A \vdash F$ in $T_1$; three of these are listed in Figure 5. We
will come back to these proofs later to discuss how they relate to the net token games that move a
token from the marking $A, A$ to the marking $F$.

To give a slightly larger example, which we hope will suffice in giving the reader the general idea,
consider the net $N_2$ in Figure 4. This net corresponds to the tensor theory $T_2$ with the following
six axioms:

\[
\begin{array}{ll}
C \otimes D \vdash A & B \vdash C \\
A \vdash B \otimes D & \\
A' \vdash B' \otimes D & \\
C' \otimes D \vdash A' & B' \vdash C'
\end{array}
\]

As one might expect, it will never be the case that from starting marking $C, C', D$, the resource
$A \otimes A'$ is obtained. More precisely, one can show that $C \otimes C' \otimes D \nvdash A \otimes A' \otimes X$ for any choice of
linear proposition $X$.

A formal definition may now be expressed as follows. Let $N$ be a net and let $S$ be the set of
places of $N$. These will be the propositional atoms over which we form a set of tensor sequents as
follows:

\[
\mathcal{L}(N) = \{ A \vdash B \mid \mathcal{M}(A) = {}^{\bullet}t \text{ and } \mathcal{M}(B) = t^{\bullet} \text{ for some } t \in T_N \}
\]

We will refer to $\mathcal{L}(N)$ as the tensor theory determined by $N$.

On the other hand, let $T$ be a set of tensor sequents in a language with propositional atoms $S$.
The theory $T$ determines a net $\mathcal{N}(T)$ as the set $T_{\mathcal{N}(T)}$ of pairs $(\mathcal{M}(A), \mathcal{M}(B))$ such that $A \vdash B$ is
in $T$. It is clear that $\mathcal{N}(\mathcal{L}(N)) = N$ for any net $N$. If $A' \vdash B'$ is an element of the set $T$ whenever
there is a sequent $A \vdash B$ in $T$ such that $\mathcal{M}(A) = \mathcal{M}(A')$ and $\mathcal{M}(B) = \mathcal{M}(B')$, then it will also be the
case that $\mathcal{L}(\mathcal{N}(T)) = T$. For example, the net $N_1$ in Figure 3 has $\mathcal{L}(N_1) = \{A \vdash B,\ B \vdash D,\ A \vdash
C,\ C \vdash E,\ D \otimes E \vdash F,\ E \otimes D \vdash F\}$.
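The passage from a net to its tensor theory can be sketched as follows. This is only an illustration under assumed encodings: a multiset is a non-empty tuple of atom names, a tensor formula is a string or a pair `(left, right)`, and a sequent is a pair `(antecedent, succedent)`. The helper names `bracketings`, `formulas` and `theory` are hypothetical.

```python
from itertools import permutations

# Assumed encoding: a multiset is a non-empty tuple of atom names; a tensor
# formula is a str (atom) or a pair (left, right) standing for a tensor product.

def bracketings(seq):
    """Yield every way of parenthesizing the non-empty sequence seq."""
    if len(seq) == 1:
        yield seq[0]
        return
    for i in range(1, len(seq)):
        for left in bracketings(seq[:i]):
            for right in bracketings(seq[i:]):
                yield (left, right)

def formulas(multiset):
    """All tensor formulas whose multiset of atoms is the given multiset."""
    return {f for order in set(permutations(multiset)) for f in bracketings(order)}

def theory(net):
    """Every sequent (A, B) whose two sides realize some transition of the net."""
    return {(a, b) for pre, post in net for a in formulas(pre) for b in formulas(post)}

# The net N1 of Figure 3 as a set of (pre-condition, post-condition) pairs.
N1 = {(("A",), ("B",)), (("B",), ("D",)), (("A",), ("C",)),
      (("C",), ("E",)), (("D", "E"), ("F",))}

for antecedent, succedent in sorted(theory(N1), key=repr):
    print(antecedent, "|-", succedent)
```

For the net of Figure 3 this enumerates six sequents, the two orderings of the pre-condition of the last transition accounting for the final pair listed in the text.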

As the reader can guess from the examples, a marking $M$ on a net $N$ corresponds to a linear
proposition $A$ such that $\mathcal{M}(A) = M$. For example, the marking of the net $N_2$ in Figure 4 is
represented by the proposition $C \otimes C' \otimes D$. In general, we have the following:


Theorem 2 (Soundness and Completeness) Given a net $N$ and markings $M$ and $M'$, the
marking $M'$ is in the forward marking set $[M\rangle$ of $M$ if and only if the sequent $A \vdash A'$ is provable in
the linear theory $\mathcal{L}(N)$ associated with $N$ for any tensor formulae $A$ and $A'$ such that $\mathcal{M}(A) = M$
and $\mathcal{M}(A') = M'$. ∎
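Read through Theorem 2, questions about provability in the examples become questions about reachable markings, which can be explored mechanically when the forward marking set is finite. The sketch below is an assumption-laden illustration, not the paper's method: markings are `collections.Counter` objects, a net is a list of `(pre, post)` pairs, and the names `fire`, `forward_markings` and `reachable` are hypothetical.

```python
from collections import Counter, deque

def freeze(marking):
    """Hashable snapshot of a marking."""
    return frozenset(marking.items())

def fire(marking, pre, post):
    """Return the marking after firing (pre, post), or None if not enabled."""
    if any(marking[place] < n for place, n in pre.items()):
        return None
    return marking - pre + post

def forward_markings(net, start):
    """Breadth-first search of the markings reachable from start.

    This terminates only when that set is finite, as in the examples here.
    """
    seen = {freeze(start)}
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        for pre, post in net:
            nxt = fire(marking, pre, post)
            if nxt is not None and freeze(nxt) not in seen:
                seen.add(freeze(nxt))
                queue.append(nxt)
    return seen

def reachable(net, start, target):
    return freeze(target) in forward_markings(net, start)

# The net N1 of Figure 3.
N1 = [(Counter("A"), Counter("B")), (Counter("B"), Counter("D")),
      (Counter("A"), Counter("C")), (Counter("C"), Counter("E")),
      (Counter("DE"), Counter("F"))]

# The net N2 of Figure 4, read off the six axioms of T2.
N2 = [(Counter(["C", "D"]), Counter(["A"])), (Counter(["A"]), Counter(["B", "D"])),
      (Counter(["B"]), Counter(["C"])), (Counter(["C'", "D"]), Counter(["A'"])),
      (Counter(["A'"]), Counter(["B'", "D"])), (Counter(["B'"]), Counter(["C'"]))]

print(reachable(N1, Counter("AA"), Counter("F")))
print(reachable(N1, Counter("A"), Counter("F")))
```

Under these assumptions the first query succeeds and the second fails, matching the earlier observation that $A, A \vdash F$ is provable in $T_1$ while $A \vdash F$ is not. The same search over `N2` from the marking with one token each on `C`, `C'` and `D` never visits a marking holding both `A` and `A'`.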


We  may  apply  the  Soundness  and  Completeness  Theorem  to  show  how  a  non-trivial  result  from 
net theory leads to a result for a fragment of linear logic. Given a finite net $N$, it is decidable whether
$M' \in [M\rangle$ for markings $M$ and $M'$. This result is the culmination of a body of research which
began with van Leeuwen [19] and has been worked on by a number of researchers [16, 9, 12, 10, 13].
Here  is  an  immediate  consequence: 

Corollary 3 Let $N$ be a finite net and $\mathcal{L}(N)$ its associated linear theory. It is decidable whether
$A \vdash B$ is provable in the theory $\mathcal{L}(N)$ for tensor formulae $A$ and $B$. ∎

Of  course,  the  Corollary  holds  only  for  linear  formulae  in  the  small  fragment  of  the  system  that  we 
have discussed. Getting an assessment of how this result compares to known results about linear
logic involves expanding our discussion to a larger fragment of the calculus. Since rules from $\mathcal{L}(N)$
may be used arbitrarily often, they must be represented as linear logic propositions using the “of
course” operator, written $!A$. (Given a linear proposition $A$, the proposition $!A$ represents the “pure
propositional content” of $A$. In the current context we may think of it as an unlimited resource of
$A$’s.) Linear propositional logic with the $!$ operator is not known to be decidable. The result above
suggests that the decision procedure for this calculus, if it exists, will not be easy to find.



3  Proofs  as  Computations 

Let us return now to our discussion of the net $N_1$ in Figure 3. This net displays some
of the intuitive representations of concepts which have made nets an appealing model for both
theoreticians and practitioners. The events $r$ and $t$ “compete” for the resource $A$ and the events $s$
and $u$ are capable of running concurrently if they have the necessary resources $B$ and $C$. There is
a causal dependency between $r$ and $s$: if $r$ fires then $s$ will be enabled. A similar dependency holds
between $t$ and $u$. If there is a line of computation which passes through $r, s$ and another which
passes through $t, u$, then these must “synchronize” before $v$ is enabled. Most of these intuitions
are represented in one form or another in the proof trees of the linear theory $\mathcal{L}(N_1)$. In particular,
the cut rule corresponds to the concept of causal dependency or sequentialization. For example, to
prove that $A \vdash D$, it is essential to use a cut. This relates to the fact that the event $r$ must take
place before the event $s$ can be enabled. Basically, the only situation in which the cut rule is never
needed for a proof is for a net whose theory is trivial since only in this case are there no causal
dependencies! Hence, for the theory determined by a non-trivial net, we cannot expect that cut
elimination is possible.

Given an initial marking $\{A, A\}$ on the net $N_1$, consider the following sequence of firings to
produce $F$: first fire $r$, then fire $t$, then fire $s$, then fire $u$ and then fire $v$. We can represent this by
the following expression:

\[
((((1_A \parallel r) ; (t \parallel 1_B)) ; (s \parallel 1_C)) ; (1_D \parallel u)) ; v
\]

where the semi-colon represents sequentialization, the parallel operator represents concurrency and
an expression $1_X$ is the “idle event” on $X$. This computation is “maximally sequential” in the
sense that it makes no real use of the possibility of doing two things “at the same time.” This
corresponds to a linear logic proof in which there are many applications of the cut rule. This proof
is given as Proof 1 in Figure 5. But there are other ways the firing sequence from $A, A$ to $F$ could
be carried out. For example: first fire $r$ and $t$, then fire $s$ and $u$, and after this, fire $v$. The following
expression represents this firing sequence:

\[
((r \parallel t) ; (s \parallel u)) ; v.
\]

This computation, which corresponds to Proof 2 in Figure 5, has still not made “maximal” use
of concurrency, although it is better than the first firing sequence. Although $r$ and $t$ are not
constrained to fire in any particular order, the event $s$, for example, is not permitted to fire until $t$
has fired. This restriction is not really intrinsic to the causal dependencies of the net. On the other
hand, it is clear that no firing sequence will allow $v$ to be fired before both $s$ and $t$ have done so.
The “best” or most concurrent firing sequence is therefore the following: fire $r$ and then $s$ while
also firing $t$ and then $u$; after this, fire $v$. This is represented by the expression:

\[
((r ; s) \parallel (t ; u)) ; v
\]

which corresponds to Proof 3 in Figure 5.
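The three expressions can be checked to describe computations between the same pair of markings. The sketch below is an illustration under stated assumptions: each expression is evaluated directly to a pair (source marking, target marking) of `Counter` multisets, place names are single letters, and the helper names `ev`, `idle`, `par` and `seq` are hypothetical rather than notation from the paper.

```python
from collections import Counter

# Events of N1 as (pre-condition, post-condition); each letter is one token.
EVENTS = {"r": ("A", "B"), "s": ("B", "D"), "t": ("A", "C"),
          "u": ("C", "E"), "v": ("DE", "F")}

def ev(name):
    """The event with the given name, as a (source, target) pair of markings."""
    pre, post = EVENTS[name]
    return Counter(pre), Counter(post)

def idle(place):
    """The idle event on a place: one token stays where it is."""
    return Counter(place), Counter(place)

def par(p, q):
    """Concurrent composition: sources and targets are added as multisets."""
    return p[0] + q[0], p[1] + q[1]

def seq(p, q):
    """Sequential composition: defined only when p ends where q starts."""
    if p[1] != q[0]:
        raise ValueError("cannot sequence: markings do not match")
    return p[0], q[1]

r, s, t, u, v = (ev(name) for name in "rstuv")

proof1 = seq(seq(seq(seq(par(idle("A"), r), par(t, idle("B"))),
                     par(s, idle("C"))), par(idle("D"), u)), v)
proof2 = seq(seq(par(r, t), par(s, u)), v)
proof3 = seq(par(seq(r, s), seq(t, u)), v)

for source, target in (proof1, proof2, proof3):
    print(dict(source), "->", dict(target))
```

All three compositions are well formed and lead from two tokens on $A$ to one token on $F$; what distinguishes them is only how much sequencing each imposes.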

Following  these  intuitions,  it  is  desirable  to  provide  a  set  of  rewrite  rules  which  will  take  proofs 
such  as  1  and  2  and  convert  them  to  a  “maximally  concurrent”  proof  such  as  3.  This  process 
resembles  the  cut  elimination  results  from  proof  theory,  but  must  differ  in  some  ways  since  the 
cut  elimination  is  being  carried  out  in  a  theory  in  which  cut  elimination  is  impossible.  A  similar 
situation arises for cut elimination in a theory with equality where all but the cuts involving
equational axioms can be eliminated. However, the “maximally concurrent” proof we desire cannot
be obtained by a straightforward translation of these ideas. Instead, it is necessary to rely on other
intuitions  about  the  correct  forms. 

4  Cut  reduction. 

In  this  section,  we  formalize  the  concepts  intuitively  discussed  in  the  previous  section.  Our  goal  is 
to  demonstrate  a  set  of  rewrite  rules  for  transforming  a  given  proof  into  a  “maximally  concurrent” 
proof  of  the  same  sequent.  We  begin  by  defining  essential  cuts  and  then  state  and  prove  the  cut 
reduction  theorem.  The  proof  is  based  on  giving  a  finite  set  of  proof  reduction  rules  which  is  shown 
to  be  strongly  normalizing. 

Definition 1 An instance of the cut rule in a proof is trivial if at least one of the premisses is an
axiom of the form $A \vdash A$.

Definition 2 An instance of a cut rule in a proof is called essential if it is non-trivial and has the
form

\[
\dfrac{\Gamma \vdash A \qquad A \vdash B}{\Gamma \vdash B}\ \mathrm{Cut}
\]

where $A$ is a netformula.

Theorem 4 (Cut-Reduction) Given a net $N$ and its associated deductive system $\mathcal{L}(N)$, if a
sequent $\Gamma \vdash A$ is provable in $\mathcal{L}(N)$, then there is a proof of this sequent in $\mathcal{L}(N)$ such that all cuts
are essential.

Intuitively,  essential  cuts  seem  to  capture  dependencies  exactly  as  dictated  by  the  underlying 
net.  A  proof  is  cut-reduced  if  all  instances  of  cuts  in  it  are  essential. 
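Definitions 1 and 2 and the notion of a cut-reduced proof can be phrased as a check on proof trees. The following is a minimal sketch under assumptions that are not in the paper: proofs are encoded as a hypothetical `Proof` record, the left premise of a cut is the one whose succedent is the cut formula, and the set of netformulas is simply supplied by the caller, since this passage does not define the term.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proof:
    rule: str              # e.g. "axiom", "id", "cut", "tensorR", "tensorL"
    antecedent: tuple      # formulas to the left of the turnstile
    succedent: object      # the single formula to the right
    premises: tuple = ()   # sub-proofs, left to right

def is_identity(p):
    """A leaf whose sequent has the same single formula on both sides."""
    return not p.premises and p.antecedent == (p.succedent,)

def is_essential(cut, net_formulas):
    """Definitions 1 and 2 applied to a node whose rule is "cut"."""
    left, right = cut.premises
    if is_identity(left) or is_identity(right):
        return False  # a trivial cut is never essential
    cut_formula = left.succedent
    # The right premise must have the cut formula as its only hypothesis.
    return right.antecedent == (cut_formula,) and cut_formula in net_formulas

def is_cut_reduced(p, net_formulas):
    """True when every cut occurring in the proof is essential."""
    ok = p.rule != "cut" or is_essential(p, net_formulas)
    return ok and all(is_cut_reduced(q, net_formulas) for q in p.premises)

# The cut that derives A |- D from the axioms A |- B and B |- D of the net N1.
r = Proof("axiom", ("A",), "B")
s = Proof("axiom", ("B",), "D")
cut = Proof("cut", ("A",), "D", (r, s))
print(is_cut_reduced(cut, {"A", "B", "C", "D", "E", "F"}))
```

Passing the netformulas explicitly keeps the sketch neutral about how they are obtained from the net.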

We  will  give  a  collection  of  rewrite  rules  for  proofs  and  show  the  existence  of  a  normalizing 
sequence.  We  will  then  strengthen  this  result  by  establishing  that  the  set  of  reduction  rules  is 
strongly normalizing. The theorem above will immediately follow from the proposition that every
normal  proof  is  cut-reduced. 

Remark:  Prawitz  [14]  distinguishes  “normal  form  theorem”,  “normalization  theorem”,  and 
“strong  normalization  theorem”.  In  his  terminology  then,  our  cut-reduction  theorem  is  a  normal 
form theorem, the second theorem will be a normalization theorem, and the last one will be a
strong  normalization  theorem. 

We begin by enumerating transformations on proofs. Assume that a proof $\mathcal{P}$ ends with an
inessential cut, i.e. it has the following form:

\[
\dfrac{\Gamma \vdash A \qquad A, \Delta \vdash B}{\Gamma, \Delta \vdash B}\ \mathrm{Cut}{:}A
\]

We will refer to the left and right sub-proofs as $\mathcal{P}'$ and $\mathcal{P}''$, respectively. The various
transformations, based on the form of $\mathcal{P}'$ and $\mathcal{P}''$, are:

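1. Axioms. This case is applicable when at least one of the sub-proofs is an axiom.

1.1 $\mathcal{P}'$ is an axiom. We have the following transformation:

\[
\dfrac{A \vdash A \qquad A, \Delta \vdash B}{A, \Delta \vdash B}\ \mathrm{Cut}{:}A
\quad\Longrightarrow\quad
A, \Delta \vdash B
\]

1.2 $\mathcal{P}''$ is an axiom. We transform $\mathcal{P}$ as follows:

\[
\dfrac{\Gamma \vdash A \qquad A \vdash A}{\Gamma \vdash A}\ \mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\Gamma \vdash A
\]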

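2. Permutation. This rule is applied when at least one of the sub-proofs $\mathcal{P}'$ and $\mathcal{P}''$ terminates
with a logical rule with the main formula being different from the cut formula $A$ or with an essential
cut. Following are the various possibilities.

2.1 Endsequent of $\mathcal{P}'$ is obtained by an essential cut or a logical rule whose main formula is
different from the cut formula. We distinguish the following cases.

2.1.1 The last rule is a $\otimes$L. We obtain the new proof as follows:

\[
\dfrac{\dfrac{\Gamma, B, C \vdash A}{\Gamma, B \otimes C \vdash A}\,\otimes\mathrm{L} \qquad A, \Delta \vdash D}{\Gamma, B \otimes C, \Delta \vdash D}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\dfrac{\Gamma, B, C \vdash A \qquad A, \Delta \vdash D}{\Gamma, B, C, \Delta \vdash D}\,\mathrm{Cut}{:}A}{\Gamma, B \otimes C, \Delta \vdash D}\,\otimes\mathrm{L}
\]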

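2.1.2 The last rule in $\mathcal{P}'$ is an essential cut. Note that since we allow at most one formula in
the succedent of a sequent, the last rule of $\mathcal{P}'$ cannot be a $\otimes$R in 2.1.

\[
\dfrac{\dfrac{\Gamma \vdash B \qquad B \vdash A}{\Gamma \vdash A}\,\mathrm{Cut}{:}B \qquad A \vdash C}{\Gamma \vdash C}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\Gamma \vdash B \qquad \dfrac{B \vdash A \qquad A \vdash C}{B \vdash C}\,\mathrm{Cut}{:}A}{\Gamma \vdash C}\,\mathrm{Cut}{:}B
\]

Note in the above that $B$ is a netformula.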

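2.2 Similar to 2.1 above but for the sub-proof $\mathcal{P}''$. We distinguish the following cases.

2.2.1 The last rule is a $\otimes$L.

\[
\dfrac{\Gamma \vdash A \qquad \dfrac{\Delta, C, D, A \vdash B}{\Delta, C \otimes D, A \vdash B}\,\otimes\mathrm{L}}{\Gamma, \Delta, C \otimes D \vdash B}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\dfrac{\Gamma \vdash A \qquad \Delta, C, D, A \vdash B}{\Gamma, \Delta, C, D \vdash B}\,\mathrm{Cut}{:}A}{\Gamma, \Delta, C \otimes D \vdash B}\,\otimes\mathrm{L}
\]

2.2.2 The last rule is a $\otimes$R. We distinguish the following two cases.

2.2.2.1 Cut formula $A$ in upper left sequent of the last rule of $\mathcal{P}''$.

\[
\dfrac{\Gamma \vdash A \qquad \dfrac{\Delta', A \vdash B \qquad \Delta'' \vdash C}{\Delta', \Delta'', A \vdash B \otimes C}\,\otimes\mathrm{R}}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\dfrac{\Gamma \vdash A \qquad \Delta', A \vdash B}{\Gamma, \Delta' \vdash B}\,\mathrm{Cut}{:}A \qquad \Delta'' \vdash C}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\,\otimes\mathrm{R}
\]

2.2.2.2 Cut formula $A$ in upper right sequent of the last rule of $\mathcal{P}''$.

\[
\dfrac{\Gamma \vdash A \qquad \dfrac{\Delta' \vdash B \qquad \Delta'', A \vdash C}{\Delta', \Delta'', A \vdash B \otimes C}\,\otimes\mathrm{R}}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\Delta' \vdash B \qquad \dfrac{\Gamma \vdash A \qquad \Delta'', A \vdash C}{\Gamma, \Delta'' \vdash C}\,\mathrm{Cut}{:}A}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\,\otimes\mathrm{R}
\]

2.2.3 The last rule of $\mathcal{P}''$ is an essential cut. In this case, the cut formula cannot come from the
upper right sequent of the essential cut above. Thus we have only one case to consider.

\[
\dfrac{\Gamma \vdash A \qquad \dfrac{\Delta', A \vdash B \qquad B \vdash C}{\Delta', A \vdash C}\,\mathrm{Cut}{:}B}{\Gamma, \Delta' \vdash C}\,\mathrm{Cut}{:}A
\quad\Longrightarrow\quad
\dfrac{\dfrac{\Gamma \vdash A \qquad \Delta', A \vdash B}{\Gamma, \Delta' \vdash B}\,\mathrm{Cut}{:}A \qquad B \vdash C}{\Gamma, \Delta' \vdash C}\,\mathrm{Cut}{:}B
\]

Note once again that $B$ belongs to some netaxiom in the two cases above.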

3. Logical. This is the case where the cut formula is the main formula of a logical rule in both
$\mathcal{P}'$ and $\mathcal{P}''$ and is introduced only by this instance of the rule. The transformation in this case
depends on the outermost logical symbol of the cut formula and since we only have one logical
connective, there is only one case to consider here.

\[
\dfrac{\dfrac{\Gamma' \vdash A_1 \qquad \Gamma'' \vdash A_2}{\Gamma', \Gamma'' \vdash A_1 \otimes A_2}\,\otimes\mathrm{R} \qquad \dfrac{A_1, A_2, \Delta \vdash B}{A_1 \otimes A_2, \Delta \vdash B}\,\otimes\mathrm{L}}{\Gamma', \Gamma'', \Delta \vdash B}\,\mathrm{Cut}{:}A_1 \otimes A_2
\quad\Longrightarrow\quad
\dfrac{\Gamma'' \vdash A_2 \qquad \dfrac{\Gamma' \vdash A_1 \qquad A_1, A_2, \Delta \vdash B}{\Gamma', A_2, \Delta \vdash B}\,\mathrm{Cut}{:}A_1}{\Gamma'', \Gamma', \Delta \vdash B}\,\mathrm{Cut}{:}A_2
\]

Remark: It may seem that the rule 2.1.2 does not appear in its most general form and one may
be tempted to consider the following as its most general form:

\[
\dfrac{\dfrac{\Gamma \vdash B \qquad B \vdash A}{\Gamma \vdash A}\,\mathrm{Cut}{:}B \qquad A, \Delta \vdash C}{\Gamma, \Delta \vdash C}\,\mathrm{Cut}{:}A
\]

However, such a form is not only redundant but incorrect too. First, note that in the situation
as above, the comma suggests that $A, \Delta \vdash C$ is obtained by a $\otimes$R or $\otimes$L, and hence 2.2.1 or 2.2.2
would be applicable. An attempt to give a reduction rule based on the form above by permuting
the two cuts will make the cut on $B$ inessential (unless $\Delta$ is empty, in which case 2.1.2 applies),
thus destroying an important invariance property of these transformations. Also, note that in
case 3 above, the transformation splits a cut into two cuts but with cut formulas with fewer
logical symbols. The transformation as presented first performs a cut on $A_1$ and then on $A_2$.
However, we could have done a cut on $A_2$ before $A_1$, giving us another transformation. But including
one or the other or both does not affect our results.

The  following  lemma  singles  out  an  important  property  of  the  above  transformations. 


Lemma 5 Let $\mathcal{P}$ be a proof and let $\mathcal{P}'$ be a proof obtained from $\mathcal{P}$ by applications of the transformations above. Then the number of sequents (nodes) in $\mathcal{P}'$ (viewed as a tree) is less than or equal to the number of nodes in $\mathcal{P}$, i.e., the number of nodes in a proof is never increased by the application of the transformations above.


Proof: Immediate. $\Box$

Definition 3 A proof $\mathcal{P}$ is in normal form if there does not exist a proof $\mathcal{P}'$ such that $\mathcal{P} \Rightarrow \mathcal{P}'$ (one-step reduction) by the transformations above.

Lemma 6 A proof $\mathcal{P}$ is in normal form iff it is cut-reduced.

Proof: (if part) Clearly $\mathcal{P}$ is trivially normal if it does not contain any inessential cut.

(only if part) Assume, on the contrary, that $\mathcal{P}$ is normal and contains inessential cuts. In $\mathcal{P}$ choose an inessential cut above which there is no other inessential cut. Then one of the reduction rules given above is applicable to the (sub)proof $\mathcal{P}'$ of $\mathcal{P}$ ending in this cut, depending on how the premisses of the (only) inessential cut in $\mathcal{P}'$ are obtained. This contradicts the assumption that $\mathcal{P}$ is normal. Hence a normal proof is cut-reduced. $\Box$

The  following  lemma  is  our  main  lemma  which  shows  the  existence  of  a  normalizing  sequence 
of  reduction. 


Lemma 7 If $\mathcal{P}$ is a proof of $\Gamma \vdash A$ which contains only one (inessential) cut, occurring as the last inference, then $\Gamma \vdash A$ is provable with no inessential cut.

The proof of the theorem then follows immediately from the above lemma by an easy induction on the number of inessential cuts appearing in a proof. In any proof, consider an inessential cut above whose lower sequent no inessential cuts appear; the (sub)proof ending in this cut satisfies the condition of the lemma. According to the lemma, this (sub)proof can be transformed into another (equivalent) proof which does not contain this cut. In doing so, the rest of the proof remains unchanged. We get a cut-reduced (equivalent) proof by repeating this process until all the inessential cuts have been eliminated.

Proof: (of the main lemma) Easy induction on the number of nodes in a proof satisfying the condition of the lemma. $\Box$

The  following  is  now  immediate. 

Theorem 8 Let $\mathcal{P}$ be a proof. Then there exists a sequence of reductions such that $\mathcal{P} \Rightarrow^{*} \mathcal{P}'$ and $\mathcal{P}'$ is in normal form. $\Box$

The  following  definition  will  be  used  in  the  proof  of  our  next  theorem. 

Definition 4 The grade $g$ of a formula $A$ is the number of occurrences of $\otimes$ contained in $A$. The grade of an inessential cut is the grade of its cut formula.

Thus, by the definition above, the grade of an essential cut is 0.

Theorem 9 (Strong Normalization) There is no infinite reduction sequence beginning with any proof $\mathcal{P}$.

Proof: We define a measure on proofs and show that each one-step transformation reduces this measure.

Let the complexity of a proof be a pair $(a, b)$, where

•  $a$ = sum of the grades $g$ of the cut formulas of all inessential cuts in the proof.

•  $b$ = sum of the numbers of nodes above all inessential cuts (including the premisses and conclusion of the cut).

Clearly, a cut-reduced proof has complexity $(0, 0)$.

Now  consider  the  three  (main)  classes  of  the  transformations  above.  It  is  easy  to  see  that 
application  of  these  transformations  in  each  case  to  a  proof  reduces  its  complexity. 

Axiom:  Both  a  and  b  are  reduced. 

Permutation:  b  is  reduced  keeping  a  the  same. 

Logical:  a  is  reduced. 

Thus,  all  reduction  sequences  terminate.  | 
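
The complexity measure used in the proof above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the `Node` class, the use of `*` for the tensor symbol, and the `essential` flag (which the caller must supply, since deciding whether a cut is essential depends on the net axioms) are all hypothetical names introduced here. We also assume that the pair $(a, b)$ is compared lexicographically, which is how Python compares tuples.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One sequent of a proof tree, tagged with the rule that produced it."""
    rule: str                                  # "axiom", "tensorR", "tensorL" or "cut"
    premises: List["Node"] = field(default_factory=list)
    cut_formula: str = ""                      # only used when rule == "cut"
    essential: bool = False                    # assumption: supplied by the caller


def grade(formula: str) -> int:
    """Grade of a formula: the number of tensor symbols, written '*' here."""
    return formula.count("*")


def size(node: Node) -> int:
    """Number of sequents in the subtree ending at `node`, the node included."""
    return 1 + sum(size(p) for p in node.premises)


def complexity(node: Node) -> Tuple[int, int]:
    """Return (a, b), both summed over every inessential cut in the tree."""
    a = b = 0
    if node.rule == "cut" and not node.essential:
        a += grade(node.cut_formula)
        b += size(node)
    for premise in node.premises:
        pa, pb = complexity(premise)
        a, b = a + pa, b + pb
    return a, b


def axiom() -> Node:
    return Node("axiom")


# A proof of A,A |- F: a cut on B*C (treated as inessential) followed by a
# cut against the axiom D*E |- F (flagged essential by assumption).
left = Node("tensorR", [axiom(), axiom()])                      # A,A |- B*C
right = Node("tensorL", [Node("tensorR", [axiom(), axiom()])])  # B*C |- D*E
inner = Node("cut", [left, right], cut_formula="B*C")
root = Node("cut", [inner, axiom()], cut_formula="D*E", essential=True)
```

For this tree `complexity(root)` counts one inessential cut of grade 1 with eight sequents at or above it. A logical step that lowers `a` counts as a decrease even if `b` grows, because the first component dominates in the lexicographic order.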

In  Appendix  A  we  have  written  out  how  the  rewriting  works  on  Proof  1  and  2  in  Figure  5. 
5  Proofs as arrows

A variety of publications have focused on the category-theoretic characteristics of Petri nets. In this section we hope to demonstrate that the proof transformations described in the previous section are compatible with at least one elegant theory of nets as categories. To this end, we note how proofs can be interpreted as arrows in the category $\mathcal{T}[N]$ of Degano, Meseguer, and Montanari [5] and then show that the proof reduction rules presented above are sound with respect to this interpretation, i.e. they transform arrows to equal arrows.

Let $N$ be a net and $\mathcal{C}(N)$ be the tensor theory determined by it. Also, let $\mathit{Proofs}(N)$ denote the class of proofs in the theory $\mathcal{C}(N)$ and let $\mathit{Mor}(N)$ denote the class of morphisms of the strictly symmetric strict monoidal category $\mathcal{T}[N]$ freely generated by $N$. We define our interpretation $I : \mathit{Proofs}(N) \to \mathit{Mor}(N)$ as follows, where in writing

\[
\begin{array}{c} \Pi \\ \Gamma \vdash A \end{array}
\]

we mean a proof $\Pi$ with conclusion $\Gamma \vdash A$.


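1. $I(A \vdash A) = 1_A : A \to A$

2. $I(A_1 \otimes A_2 \cdots \otimes A_n \vdash B_1 \otimes B_2 \cdots \otimes B_m) = t : A_1 \otimes A_2 \cdots \otimes A_n \to B_1 \otimes B_2 \cdots \otimes B_m$, where $t$ is in $N$.

3.
\[
I\left( \dfrac{\begin{array}{c} \Pi_1 \\ \Gamma \vdash A \end{array} \qquad \begin{array}{c} \Pi_2 \\ \Delta \vdash B \end{array}}{\Gamma, \Delta \vdash A \otimes B} \right) = f \otimes g : \Gamma \otimes \Delta \to A \otimes B,
\]
where $f : \Gamma \to A = I\left(\begin{array}{c} \Pi_1 \\ \Gamma \vdash A \end{array}\right)$ and $g : \Delta \to B = I\left(\begin{array}{c} \Pi_2 \\ \Delta \vdash B \end{array}\right)$.

4.
\[
I\left( \dfrac{\begin{array}{c} \Pi \\ \Gamma, A, B \vdash C \end{array}}{\Gamma, A \otimes B \vdash C} \right) = I\left( \begin{array}{c} \Pi \\ \Gamma, A, B \vdash C \end{array} \right)
\]

5.
\[
I\left( \dfrac{\begin{array}{c} \Pi_1 \\ \Gamma \vdash A \end{array} \qquad \begin{array}{c} \Pi_2 \\ A, \Delta \vdash B \end{array}}{\Gamma, \Delta \vdash B} \right) = (f \otimes 1_\Delta) \circ g : \Gamma \otimes \Delta \to B,
\]
where $f : \Gamma \to A = I\left(\begin{array}{c} \Pi_1 \\ \Gamma \vdash A \end{array}\right)$ and $g : A \otimes \Delta \to B = I\left(\begin{array}{c} \Pi_2 \\ A, \Delta \vdash B \end{array}\right)$.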
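
As an informal illustration of the five clauses of the interpretation, the following sketch maps a proof tree to a printed arrow expression. It is a simplification under stated assumptions: the `Proof` class and its fields are hypothetical names, arrows are represented only as strings (so no equations between arrows are checked), and composition is printed in the order "first the left arrow, then the right one".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Proof:
    rule: str                                   # "id", "net", "tensorR", "tensorL" or "cut"
    premises: List["Proof"] = field(default_factory=list)
    label: str = ""                             # formula for "id", transition name for "net"
    context: str = ""                           # for "cut": extra hypotheses of the right premise


def interpret(p: Proof) -> str:
    """Return the arrow expression assigned to proof `p`, as a string."""
    if p.rule == "id":                          # clause 1: identity axiom
        return f"1_{p.label}"
    if p.rule == "net":                         # clause 2: axiom given by a transition
        return p.label
    if p.rule == "tensorR":                     # clause 3: tensor of the two premises
        f, g = (interpret(q) for q in p.premises)
        return f"({f} ⊗ {g})"
    if p.rule == "tensorL":                     # clause 4: same arrow as the premise
        return interpret(p.premises[0])
    if p.rule == "cut":                         # clause 5: pad with an identity, then compose
        f, g = (interpret(q) for q in p.premises)
        left = f"({f} ⊗ 1_{p.context})" if p.context else f
        return f"({left} ∘ {g})"
    raise ValueError(f"unknown rule: {p.rule}")


r, s = Proof("net", label="r"), Proof("net", label="s")
r2 = Proof("net", label="r'")
sequential = Proof("cut", [r, s])
parallel = Proof("tensorL", [Proof("tensorR", [r, r2])])
padded = Proof("cut", [r, s], context="D")
```

Here `interpret(sequential)` gives `(r ∘ s)` and `interpret(padded)` gives `((r ⊗ 1_D) ∘ s)`. The $\otimes L$ step in `parallel` leaves the arrow unchanged, as clause 4 requires.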

Proposition  10  The  proof  reduction  rules  are  sound  with  respect  to  the  interpretation  above. 

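Proof: We just consider an illustrative case here. Consider the reduction rule 2.2.2.1. The function $I$ yields an arrow corresponding to the left hand side as follows. Let $f : \Gamma \to A$, $g : \Delta' \otimes A \to B$, and $h : \Delta'' \to C$. Then we have:

\[
\begin{aligned}
& (f \otimes 1_{\Delta' \otimes \Delta''}) \circ (g \otimes h) : \Gamma \otimes (\Delta' \otimes \Delta'') \to B \otimes C \\
&= (f \otimes (1_{\Delta'} \otimes 1_{\Delta''})) \circ (g \otimes h) \\
&= ((f \otimes 1_{\Delta'}) \otimes 1_{\Delta''}) \circ (g \otimes h) \\
&= ((f \otimes 1_{\Delta'}) \circ g) \otimes (1_{\Delta''} \circ h) \\
&= ((f \otimes 1_{\Delta'}) \circ g) \otimes h \\
&= I(\text{right hand side}) \qquad \Box
\end{aligned}
\]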

In view of the above proposition and the strong normalization theorem, the following is immediate.

Corollary 11 Every proof reduces to a unique normal form modulo the interpretation. $\Box$

It has long been argued by proof theorists that a notion of equivalence of proofs based on mere provability is too extensional and inadequate. But the question of the right notion of equivalence of proofs still remains open. Prawitz [14], for the system of Natural Deduction and his set of reduction rules, conjectured that two derivations represent the same proof if and only if they reduce to the same normal form. In view of Corollary 11 above we may say something similar for the identification of the derivations in a tensor theory. However, it seems that such an identification does not quite capture the intuitive sense of equivalence (based on processes) that we have in mind for net computations and is still too extensional. For example, proof 2 and proof 3 of section 3 would be identified, as the corresponding arrows are equal because $- \otimes -$ is a bifunctor. However, the process interpretation that we have in mind should not result in such an identification. Thus the sense in which proof 3 is not equivalent to proof 2 (and is in fact better) is lost in the denotational view that we have presented in this section. We are currently looking at how to attach such intensional interpretations to proofs in our setting. We have made some partial progress towards this, though mostly via ad hoc means.

6  Choice  Situations 

We have so far restricted attention to a rather small fragment of linear logic because this fragment is already sufficient to illustrate several important concepts that suggest interesting relationships between concurrent computations and proofs. However, we believe that this is really only the beginning of the story. To give the reader a taste of how the theory can be further developed, we will give two simple examples that illustrate the potential role of the linear connectives known as “direct product” and “direct sum.”

Given linear formulas $A$ and $B$, the expression $A \mathbin{\&} B$ is a linear formula pronounced “$A$ direct product $B$”. The $\& R$ rule is

\[
\dfrac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \mathbin{\&} B}
\]

Intuitively, $A \mathbin{\&} B$ can be obtained from the resources $\Gamma$ provided these resources can be used to obtain $A$ and can also be used to obtain $B$. Note carefully how this differs from the $\otimes R$ rule, where the resources must be divided in two parts, one part for proving $A$ and the other for proving $B$. The $\& L$ rules are

\[
\dfrac{\Gamma, A \vdash C}{\Gamma, A \mathbin{\&} B \vdash C} \qquad \dfrac{\Gamma, B \vdash C}{\Gamma, A \mathbin{\&} B \vdash C}
\]

Intuitively, the resources $C$ can be obtained from $A \mathbin{\&} B$ provided they can be obtained either from $A$ or from $B$.


Figure 6: $A \otimes B \otimes C \vdash (D \otimes C) \mathbin{\&} (A \otimes E)$

The intuitive explanations given above are meant to suggest to the reader the idea that the direct product operator represents a form of choice. To see a very simple example, which we hope will be convincing enough to capture the reader’s interest, consider the net pictured in Figure 6. Following the theory that we have developed in the preceding sections, this net is represented by the linear theory consisting of the sequents $A \otimes B \vdash D$ and $B \otimes C \vdash E$. Now, given a starting marking of one token on each of $A$, $B$ and $C$, it is clear that a token can be moved to at most one of the conditions $D$ and $E$. One might say that “$D \vee E$” is an obtainable marking, but “$D \wedge E$” is not. On the other hand, it seems that “$D \wedge E$” is obtainable in the sense that either of the conditions $D$ and $E$ may be fulfilled by some firing. In linear logic one may express this state of affairs with the formula $(D \otimes C) \mathbin{\&} (A \otimes E)$. Here is a proof of the proper statement:

\[
\dfrac{
  \dfrac{\dfrac{A \otimes B \vdash D \qquad C \vdash C}{A \otimes B, C \vdash D \otimes C}}{A \otimes B \otimes C \vdash D \otimes C}
  \qquad
  \dfrac{\dfrac{A \vdash A \qquad B \otimes C \vdash E}{A, B \otimes C \vdash A \otimes E}}{A \otimes B \otimes C \vdash A \otimes E}
}{A \otimes B \otimes C \vdash (D \otimes C) \mathbin{\&} (A \otimes E)}
\]

Figure 7: Coca Cola implements $\$1 \vdash \mathit{coke} \oplus \mathit{pepsi}$.

It is our feeling that the direct product operator captures a form of external choice. On the other hand, the linear disjunction captures a concept of internal choice. Given two linear propositions $A$ and $B$, one proves the linear disjunction $A \oplus B$ of $A$ and $B$ from hypotheses $\Gamma$ by using one of the following rules:

\[
\dfrac{\Gamma \vdash A}{\Gamma \vdash A \oplus B} \qquad \dfrac{\Gamma \vdash B}{\Gamma \vdash A \oplus B}
\]

In other words, the resource $A \oplus B$ can be obtained from $\Gamma$ just in case either $A$ or $B$ can be. On the other hand, if one wishes to obtain $C$ from $\Gamma$ and the resource $A \oplus B$, then it must be shown that $C$ can be obtained from both $A$ and $B$. The rule is

\[
\dfrac{\Gamma, A \vdash C \qquad \Gamma, B \vdash C}{\Gamma, A \oplus B \vdash C}
\]

Figure 8: Pepsi Cola implements $\$1 \vdash \mathit{coke} \oplus \mathit{pepsi}$.

The internal/external distinction can be illustrated by a simple example which takes linear propositions as a specification language. Let us assume that we wish to contract a vendor to build us a machine for dispensing soft drinks for \$1 and there are two possible flavors of drink that are available: coke and pepsi. If we do not actually care which of these is dispensed when a dollar is given to the machine, we may make the specification $\$1 \vdash \mathit{coke} \oplus \mathit{pepsi}$. The choice of which beverage we are given in exchange for \$1 will then be internal to the machine. For example, the machine which the Coca Cola company might design is pictured in Figure 7 and would dispense only coke. On the other hand, the Pepsi Cola company might design the machine in Figure 8, which dispenses only pepsi. In both cases, the tensor theories determined by the nets are strong enough that it is possible to prove the proposition $\$1 \vdash \mathit{coke} \oplus \mathit{pepsi}$. Thus the machines have met the specification. If, on the other hand, we wish to ensure that there is an external choice so that a user can pick the flavor of his preference, then we might have used the specification $\$1 \vdash \mathit{coke} \mathbin{\&} \mathit{pepsi}$. This specification is not met by either of the machines in Figures 7 and 8, since this sequent is not provable in either of the linear theories associated with these machines. On the other hand, the net in Figure 9 does meet the specification, since its associated theory is strong enough to prove the specifying sequent.

Figure 9: An implementation with choice: $\$1 \vdash \mathit{coke} \mathbin{\&} \mathit{pepsi}$.

More research is needed to understand the possible significance of linear connectives in specification languages. While the example seems reasonable, it is worth keeping in mind that several other nets would meet the desired specification. For example, the net associated with the following theory

\[
\begin{array}{l}
\$1 \vdash \mathit{coke} \\
\$1 \vdash \mathit{pepsi} \\
\mathit{pepsi} \vdash \$1
\end{array}
\]

which gives a \$1 in exchange for a pepsi, also meets the specification! On the other hand, the net associated with the sequent $\$1 \vdash \mathit{coke} \otimes \mathit{pepsi}$, which dispenses both a coke and a pepsi for each \$1, does not implement the specification.
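
The vending machine comparison can be tried out with a small sketch. It rests on a deliberate simplification that is our assumption, not a result of the text: a tensor of places is taken to be obtainable when that exact marking is reachable from the start marking, a direct product when both components are obtainable, and a direct sum when at least one is. The function names, the tuple encoding of specifications, and the transitions assumed for Figure 9 (one transition per flavor) are all hypothetical.

```python
from collections import deque


def fire(marking, pre, post):
    """Fire a transition on a marking; return None if it is not enabled."""
    tokens = list(marking)
    for place in pre:
        if place not in tokens:
            return None
        tokens.remove(place)          # consume one token per precondition
    return tuple(sorted(tokens + list(post)))


def reachable(start, net, limit=1000):
    """Breadth-first search over markings, cut off after `limit` markings."""
    first = tuple(sorted(start))
    seen = {first}
    queue = deque([first])
    while queue and len(seen) < limit:
        marking = queue.popleft()
        for pre, post in net:
            nxt = fire(marking, pre, post)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def meets(start, spec, net):
    """spec is ("tensor", place, ...), ("with", s1, s2) or ("plus", s1, s2)."""
    tag = spec[0]
    if tag == "with":                 # external choice: both must be obtainable
        return meets(start, spec[1], net) and meets(start, spec[2], net)
    if tag == "plus":                 # internal choice: one is enough
        return meets(start, spec[1], net) or meets(start, spec[2], net)
    return tuple(sorted(spec[1:])) in reachable(start, net)


coke, pepsi = ("tensor", "coke"), ("tensor", "pepsi")
internal = ("plus", coke, pepsi)
external = ("with", coke, pepsi)

coca_cola = [(("$1",), ("coke",))]                  # Figure 7
pepsi_cola = [(("$1",), ("pepsi",))]                # Figure 8
choice = coca_cola + pepsi_cola                     # Figure 9, as assumed here
refund = choice + [(("pepsi",), ("$1",))]           # the three-sequent theory
bundle = [(("$1",), ("coke", "pepsi"))]             # coke and pepsi together
```

Under this reading, `meets(("$1",), external, net)` holds for `choice` and `refund` but fails for `coca_cola`, `pepsi_cola` and `bundle`, which agrees with the discussion above.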

There are several more linear connectives which we will not discuss. No account with which we are familiar has addressed the linear logic negation, which seems to represent the dual of a resource, a “debt” perhaps. A treatment of negation would lead to an understanding of the dual of the tensor, called the par, which seems to represent a concept of “concurrent debts”. We mentioned earlier the unary operator $!$ which represents an unlimited resource. This operator plays a subtle role in the theory we have exposited; work of Carolyn Brown [4] provides helpful insight. All of the linear logic connectives seem to have their own significance in terms of computation on nets. (We have included a list of some of the rules of linear logic in Appendix B.) Work on the exploitation of these ideas is likely to be profitable for the study of both concurrency and proof theory.
A  Sample  Proof  Transformation

We  give  some  examples  of  cut-reduction  below.  At  each  step  of  the  reduction,  the  inessential  cut 
to  which  a  reduction  rule  is  applied  is  denoted  Cut\.  Other  choices  of  inessential  cuts,  if  any,  at 
a  step  to  which  a  rule  could  have  been  applied  are  denoted  Cut.  Remaining  inessential  cuts  are 
denoted  Cut. 

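Example 1

\[
\dfrac{
  \dfrac{
    \dfrac{A \vdash B \qquad A \vdash C}{A, A \vdash B \otimes C}\,\otimes R
    \qquad
    \dfrac{\dfrac{B \vdash D \qquad C \vdash E}{B, C \vdash D \otimes E}\,\otimes R}{B \otimes C \vdash D \otimes E}\,\otimes L
  }{A, A \vdash D \otimes E}\,\mathrm{Cut}
  \qquad D \otimes E \vdash F
}{A, A \vdash F}\,\mathrm{Cut}
\]

\[
\Longrightarrow
\]

\[
\dfrac{
  \dfrac{
    A \vdash C
    \qquad
    \dfrac{A \vdash B \qquad \dfrac{B \vdash D \qquad C \vdash E}{B, C \vdash D \otimes E}\,\otimes R}{A, C \vdash D \otimes E}\,\mathrm{Cut}
  }{A, A \vdash D \otimes E}\,\mathrm{Cut}
  \qquad D \otimes E \vdash F
}{A, A \vdash F}\,\mathrm{Cut}
\]

\[
\Longrightarrow
\]

\[
\dfrac{
  \dfrac{
    A \vdash C
    \qquad
    \dfrac{\dfrac{A \vdash B \qquad B \vdash D}{A \vdash D}\,\mathrm{Cut} \qquad C \vdash E}{A, C \vdash D \otimes E}\,\otimes R
  }{A, A \vdash D \otimes E}\,\mathrm{Cut}
  \qquad D \otimes E \vdash F
}{A, A \vdash F}\,\mathrm{Cut}
\]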


Example 2

The reduction steps in this example are labelled, in order of appearance, 1.1, 2.2.2.2, 1.2, 2.2.2.1, 2.2.2.2, 2.2.2.2, 2.2.2.2 and 1.2. The sequence ends with the following cut-reduced proof:

\[
\dfrac{
  \dfrac{
    \dfrac{A \vdash B \qquad B \vdash D}{A \vdash D}\,\mathrm{Cut}
    \qquad
    \dfrac{A \vdash C \qquad C \vdash E}{A \vdash E}\,\mathrm{Cut}
  }{A, A \vdash D \otimes E}\,\otimes R
  \qquad D \otimes E \vdash F
}{A, A \vdash F}\,\mathrm{Cut}
\]
Abstract

This paper discusses the relevance of a form of cut elimination theorem for linear logic tensor theories to the concept of a process on a Petri net. We base our discussion on two definitions of processes given by Best and Devillers. Their notions of process correspond to equivalence relations on linear logic proofs. It is noted that the cut reduced proofs form a process under the finer of these definitions. Using a strongly normalizing rewrite system and a weak Church-Rosser theorem, we show that each class of the coarser process definition contains exactly one of these finer classes, which can therefore be viewed as a canonical or normal process representative. We also discuss the relevance of our rewrite rules to the categorical approach of Degano, Meseguer, and Montanari.


1  Introduction 

It has often been useful to take ideas from proof theory and look at their computational significance. One very fruitful line of investigation has been the use of the Curry-Howard correspondence (the “propositions as types” idea) as a way of seeing proofs as programs and types as specifications. This correspondence reveals an analogy between cut elimination in systems of natural deduction and the reduction of lambda-terms, thus strongly connecting the study of a central proof-theoretic idea (with a history dating back at least to the 1930’s) with a central computational concept in sequential functional programming.

Another, more recent, line of investigation with a kinship to this sequential theory concerns the relationship between certain kinds of proofs and concepts in concurrency. A number of authors have discussed the idea of relating concurrent computations as represented by Petri nets to proofs in linear logic [7]. One line of research seeks to use the fact that nets give rise to a monoid structure and can therefore be used to model linear logic through the use of a phase semantics [6]. In this way a net can be viewed as a model of the linear connectives in which there is a correspondence between the truth of a linear sequent in the model and the reachability relation on the net. However, most of the research [8, 9, 1, 4] has focused on the idea that a net may be viewed as a theory in a fragment of linear logic (the tensor theory, to be precise). In particular, when things are viewed in this way, there is a precise correspondence between concurrent computations on a Petri net and linear logic proofs in its associated theory. This opens a way to investigate a transfer of ideas between proof theory and the theory of concurrent computation.

The purpose of this paper is to look at the significance of the cut elimination theorem of linear logic tensor theories in the context of computations on Petri nets. The role of cut in this theory is somewhat different from cut in the context of “propositions as types”. A linear proof corresponds to a net computation, and the elimination of a cut corresponds to a transformation of that computation rather than an enactment of the computation. The authors of this paper have suggested before [8] that cut elimination corresponds to a form of optimization in which a computation is transformed into a “more concurrent” computation. It is our goal in this paper to show how this view of the significance of the cut fits into a theory of Petri net processes.

In particular, we show that linear logic cut elimination provides a way of understanding the relationship between two definitions of the notion of a net process studied in Best and Devillers [3]. We will define a pair of relations $S$ and $T$ on linear logic proofs which correspond to two of the concepts of net process discussed in [3]. The relation $T$ is coarser than $S$ and relates some computations which display different levels of dependency in their descriptions (i.e. one description permits two things to be done at the same time while the other description sequentializes them). What we wish to show is that there is a unique $S$-equivalence class of processes in each $T$-equivalence class $r$ which can be viewed as a “maximally concurrent” representative of $r$. Moreover, the members of this $S$-equivalence class will be exactly the set of cut reduced proofs in $r$. Since we can demonstrate a strongly normalizing rewrite system for linear proofs which preserves $T$-equivalence, we can therefore view the distinguished $S$-equivalence class in $r$ as a normal representative process for it. In general, normal representatives of classes of $T$ may be viewed as the “maximally concurrent” computations on a net.

Our first primary technical result is a strong normalization theorem (Theorem 2) for a set of rewrites which preserve $T$-equivalence. Termination for the rewrites can be proved using a measure on cut formulas and the number of nodes in a proof tree. Our second primary technical result is a weak Church-Rosser theorem (Theorem 3) for the action of the rewrite system on equivalence classes of $S$. Confluence of our rewrite system therefore follows from Newman’s Lemma. This result is then used to show that there is a unique normal process representative in each equivalence class of $T$ (Theorem 4).

We begin by discussing some relevant work and definitions in section 2, where we also define the two equivalence relations. In section 3 we give a set of rewrite rules on proofs which are shown to be strongly normalizing. Section 4 contains our second main result, where we prove the Church-Rosser property for the induced rewrite rules on $S$-equivalence classes. We then define an equivalence based on this notion of reduction and show that this equivalence coincides with $T$-equivalence, giving us the desired theorem about unique process representatives. Finally, in section 5 we discuss the relationship between reduction on proofs and rewrite rules for arrows which arise by interpreting proofs as arrows.

2  Two theories of processes

In this section we introduce the background on processes and linear logic needed to understand the central results of the paper. First, let’s start with an example. Consider the net pictured in Figure 1. It has six places, drawn as circles and marked $A, B, C, A', B', C'$, and it has five transitions, written as rectangles and marked $r, s, r', s', t$. The two closed circles on the places $A$ and $A'$ are tokens which indicate the availability of the “resources” $A$ and $A'$. In the configuration in the picture, the transitions $r$ and $r'$ are enabled by the fulfillment of their preconditions $A$ and $A'$ respectively. Dynamically, computations proceed by the firing of transitions. If transition $r$ fires, for example, then the token is removed from $A$ and placed on its postcondition $B$; the transition $r$ is now disabled since its precondition $A$ is no longer filled. We may also speak of the concurrent firing of $r$ and $r'$ in the starting configuration of Figure 1 since there is no dependency between their preconditions.

For formal definitions we refer the reader to recent publications in LICS [10, 5]. For this paper we will take it as a working definition that a net (or, to be more precise, a place/transition net) is a set of pairs of multisets over a set $S$ of places. This is really a special case of the definition of a net in [10, 5], where distinct transitions with the same pre- and postconditions are permitted, but the restriction simplifies our notation since it avoids the need to label the linear sequents to preserve a precise correspondence between nets and linear tensor theories. For this preliminary discussion, it will be convenient to utilize their categorical treatment of nets and write transitions as arrows in a category with a binary operator $\otimes$ on its objects. In this notation, the transitions in the figure may be viewed as arrows:

\[
\begin{array}{ll}
r : A \to B & s : B \to C \\
\multicolumn{2}{c}{t : B \otimes B' \to C \otimes C'} \\
r' : A' \to B' & s' : B' \to C'
\end{array}
\]

There are two operations on arrows. If $f : X \to Y$ and $g : Y \to Z$, then $f ; g : X \to Z$ is the composition of $f$ and $g$. If $f : X \to X'$ and $g : Y \to Y'$, then $f \otimes g : X \otimes Y \to X' \otimes Y'$ is the tensor product of $f$ and $g$. Starting with the basic transitions, these operations generate a language of computations on the net. Intuitively we read $f ; g$ as the sequentialization of $f$ and $g$: “first do $f$ and then do $g$”. We read $f \otimes g$ as the concurrent performance of operations $f$ and $g$: “do $f$ and $g$ at the same time”.

Looking again at Figure 1, here are four sample computations of type $A \otimes A' \to C \otimes C'$ on the net $N$:

\[
\begin{aligned}
f &= (r \otimes A') ; (s \otimes A') ; (C \otimes r') ; (C \otimes s') \\
f' &= (r \otimes r') ; (s \otimes s') \\
g &= (r \otimes A') ; (B \otimes r') ; t \\
g' &= (r \otimes r') ; t
\end{aligned}
\]

where the idle transition (identity map) on a place $X$ is written simply as $X$. Much of the research on nets (and, indeed, concurrency as a whole) has focused on the question of when two computations such as the ones above are “essentially the same”. In the case of the computations above, one may well expect to distinguish between processes $f$ and $g$, for example, since one of these computations
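Structural Rules

\[
\dfrac{}{A \vdash A}\ (\text{Identity}) \qquad \dfrac{\Gamma \vdash A \qquad A, \Delta \vdash B}{\Gamma, \Delta \vdash B}\ (\text{Cut})
\]

Logical Rules

\[
\dfrac{\Gamma \vdash A \qquad \Delta \vdash B}{\Gamma, \Delta \vdash A \otimes B}\ (\otimes R) \qquad \dfrac{\Gamma, A, B \vdash C}{\Gamma, A \otimes B \vdash C}\ (\otimes L)
\]

Figure 2: Structural and logical rules for the tensor fragment of linear logic.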

performs transitions $s$ and $s'$ but not $t$, while the other performs $t$ but neither $s$ nor $s'$. On the other hand, it is debatable whether $f$ and $f'$ or $g$ and $g'$ should be identified. What this comes down to is a question of whether $\otimes$ is a functor or not, i.e. is it the case that the equation

\[
(u \otimes v) ; (u' \otimes v') = (u ; u') \otimes (v ; v')
\]

holds for arbitrary computations $u, u', v, v'$ which make the equation type correct? Arguing one way, this is a pleasing equational property which is supported by a computational intuition that the only real difference between the left and right hand sides of this equality is the order in which things are done. Arguing against functoriality, it seems that computations $f'$ and $g'$ are somehow “better” than $f$ and $g$, respectively, since they allow more independent computations to be performed concurrently. For example, $f$ does not perform the transition $r'$ until after $r$ is complete, but this is unnecessary since $r$ and $r'$ can be done at the same time, as in $f'$.
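
The language of computations built from composition and tensor can be sketched as a small term datatype with a type checker. This is only an illustration: the class names `Atom`, `Seq` and `Par`, the helpers `idle`, `seq` and `typ`, and the encoding of sources and targets as multisets of place names are assumptions made here. The check confirms that the four sample computations share the type $A \otimes A' \to C \otimes C'$; it does not decide whether any two of them should be identified.

```python
from collections import Counter
from dataclasses import dataclass
from functools import reduce
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:                       # a basic or idle transition
    name: str
    src: Tuple[str, ...]
    tgt: Tuple[str, ...]


@dataclass(frozen=True)
class Seq:                        # f ; g  (first f, then g)
    first: "Term"
    second: "Term"


@dataclass(frozen=True)
class Par:                        # f (x) g  (f and g side by side)
    left: "Term"
    right: "Term"


Term = Union[Atom, Seq, Par]


def idle(place: str) -> Atom:
    """The idle transition on a place."""
    return Atom(place, (place,), (place,))


def seq(*terms: Term) -> Term:
    """Left-nested sequential composition of several terms."""
    return reduce(Seq, terms)


def typ(t: Term) -> Tuple[Counter, Counter]:
    """Source and target of a term as multisets; TypeError on a mismatch."""
    if isinstance(t, Atom):
        return Counter(t.src), Counter(t.tgt)
    if isinstance(t, Par):
        ls, lt = typ(t.left)
        rs, rt = typ(t.right)
        return ls + rs, lt + rt
    fs, ft = typ(t.first)
    gs, gt = typ(t.second)
    if ft != gs:
        raise TypeError("target of first step differs from source of second")
    return fs, gt


r = Atom("r", ("A",), ("B",))
s = Atom("s", ("B",), ("C",))
t = Atom("t", ("B", "B'"), ("C", "C'"))
r1 = Atom("r'", ("A'",), ("B'",))
s1 = Atom("s'", ("B'",), ("C'",))

f = seq(Par(r, idle("A'")), Par(s, idle("A'")), Par(idle("C"), r1), Par(idle("C"), s1))
f1 = seq(Par(r, r1), Par(s, s1))
g = seq(Par(r, idle("A'")), Par(idle("B"), r1), t)
g1 = seq(Par(r, r1), t)
```

Because sources and targets are compared as multisets, the sketch treats $X \otimes Y$ and $Y \otimes X$ as the same object, in line with the working definition of a net as a set of pairs of multisets.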

The  dividing  line  between  a  theory  of  processes  which  identifies  /  with  f  and  g  with  g'  and 
a  theory  which  does  not  make  these  identifications  has  been  carefully  examined  in  [3]  and  [5]. 
We  will  not  introduce  the  theory  in  the  form  that  they  have  done,  but  instead  present  it  as  an 
equational  theory  of  proofs.  The  reader  can  check  that  our  presentations  of  the  relations  S  and 
T  for  proofs  as  given  below  correspond  to  the  equational  theories  with  these  names  presented  by 
Degano,  Meseguer,  and  Montanari  [5]  in  the  last  LICS  symposium. 

A linear tensor formula is either a propositional atom or has the form A ⊗ B where A and B are
tensor formulas. A linear logic tensor sequent is a pair Γ ⊢ A where A is a tensor formula and Γ is
a multiset. The rules for deriving sequents of this fragment of linear logic are given in Figure 2. A
tensor theory is a set of pairs A ⊢ B where A and B are tensor formulas. Given a net N with places
S, the associated tensor theory is the set of all sequents A ⊢ B where A and B are tensor formulas
formed over atoms from S such that the pair of multisets (M(A), M(B)) determined by A and B is
an element of the net N. There is a precise correspondence between computations on N and proofs
over  the  associated  theory  in  which  uses  of  axioms  in  a  proof  correspond  to  firings  of  transitions  on 
the  net  and  uses  of  the  rules  for  tensor  on  the  right  and  cut  correspond  to  the  tensor  and  composition 
of  computations  respectively.  We  omit  a  further  discussion  of  this  correspondence  here  since  it  has 
been  exposited  adequately  elsewhere  ([9]  provides  a  rigorous  treatment  for  example).  The  remainder 
of  this  paper  will  be  concerned  with  computations  as  represented  by  proofs. 
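
The construction of the tensor theory associated with a net can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of formulas as nested tuples, the helper names `tensor`, `multiset`, `freeze` and `is_axiom`, and the one-transition net `NET` are all hypothetical and do not come from the paper.

```python
from collections import Counter

# Assumed encoding: an atom is a string, A (x) B is the tuple ("tensor", A, B).
def tensor(a, b):
    return ("tensor", a, b)

def multiset(formula):
    """M(A): the multiset of atoms occurring in the tensor formula A."""
    if isinstance(formula, str):
        return Counter([formula])
    _, left, right = formula
    return multiset(left) + multiset(right)

def freeze(counter):
    """Hashable form of a multiset, so that pairs can be stored in a set."""
    return frozenset(counter.items())

# Hypothetical net, viewed as a set of (pre-multiset, post-multiset) pairs.
# It has a single transition that consumes a and b and produces c.
NET = {
    (freeze(Counter({"a": 1, "b": 1})), freeze(Counter({"c": 1}))),
}

def is_axiom(a, b, net=NET):
    """A |- B belongs to the tensor theory iff (M(A), M(B)) is in the net."""
    return (freeze(multiset(a)), freeze(multiset(b))) in net
```

Because only the multisets matter, `is_axiom(tensor("a", "b"), "c")` and `is_axiom(tensor("b", "a"), "c")` both hold in this sketch, while `is_axiom("a", "c")` does not.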

We  now  begin  a  more  formal  discussion  of  proofs  and  the  cut  rule.  In  the  presence  of  proper 
axioms  it  is  not  possible  to  eliminate  cuts.  Our  concept  of  cut  reduction  will  be  based  on  a  definition 
of  a  normal  form  for  proofs  and  a  system  of  rewrite  rules  for  proofs  not  in  normal  form.  We  say 
that  an  instance  of  the  cut  rule  in  a  proof  is  trivial  if  at  least  one  of  the  premisses  is  an  axiom  of 
the form A ⊢ A. A linear formula A is said to be a netformula (of T) if it appears in one of the
formulas of the theory T. An instance of a cut rule is said to be essential (in a proof in the theory
T) if it is non-trivial and has the form

$$\dfrac{\Gamma \vdash A \qquad A \vdash B}{\Gamma \vdash B}\ \text{Cut}$$

where A is a netformula. A proof is said to be in normal form if all of its cuts are essential. We
will discuss normalization of proofs in the next section.

Definition 1 Let N be a net. The equivalence relation S(N) on proofs is defined as the smallest
equivalence relation satisfying the following equations between proof trees.


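$$\dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C \vdash D} \qquad \overset{\Pi'}{\Delta \vdash E}}{\Gamma, B, C, \Delta \vdash D \otimes E}\,{\otimes}R}{\Gamma, B \otimes C, \Delta \vdash D \otimes E}\,{\otimes}L \;=\; \dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C \vdash D}}{\Gamma, B \otimes C \vdash D}\,{\otimes}L \qquad \overset{\Pi'}{\Delta \vdash E}}{\Gamma, B \otimes C, \Delta \vdash D \otimes E}\,{\otimes}R \tag{1}$$

$$\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi'}{\Delta \vdash B}}{\Gamma, \Delta \vdash A \otimes B}\,{\otimes}R \;=\; \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \overset{\Pi}{\Gamma \vdash A}}{\Delta, \Gamma \vdash B \otimes A}\,{\otimes}R \tag{2}$$

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C, \Delta, D, E \vdash F}}{\Gamma, B \otimes C, \Delta, D, E \vdash F}\,{\otimes}L}{\Gamma, B \otimes C, \Delta, D \otimes E \vdash F}\,{\otimes}L \;=\; \dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C, \Delta, D, E \vdash F}}{\Gamma, B, C, \Delta, D \otimes E \vdash F}\,{\otimes}L}{\Gamma, B \otimes C, \Delta, D \otimes E \vdash F}\,{\otimes}L \tag{3}$$

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi'}{\Delta \vdash B}}{\Gamma, \Delta \vdash A \otimes B}\,{\otimes}R \qquad \overset{\Pi''}{\Lambda \vdash C}}{\Gamma, \Delta, \Lambda \vdash A \otimes B \otimes C}\,{\otimes}R \;=\; \dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \overset{\Pi''}{\Lambda \vdash C}}{\Delta, \Lambda \vdash B \otimes C}\,{\otimes}R}{\Gamma, \Delta, \Lambda \vdash A \otimes B \otimes C}\,{\otimes}R \tag{4}$$

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi'}{\Delta, A \vdash B}}{\Gamma, \Delta \vdash B}\ \text{Cut} \qquad \overset{\Pi''}{\Lambda, B \vdash C}}{\Gamma, \Delta, \Lambda \vdash C}\ \text{Cut} \;=\; \dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \dfrac{\overset{\Pi'}{\Delta, A \vdash B} \qquad \overset{\Pi''}{\Lambda, B \vdash C}}{\Delta, \Lambda, A \vdash C}\ \text{Cut}}{\Gamma, \Delta, \Lambda \vdash C}\ \text{Cut} \tag{5}$$

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C \vdash D} \qquad \overset{\Pi'}{D \vdash E}}{\Gamma, B, C \vdash E}\ \text{e-cut}}{\Gamma, B \otimes C \vdash E}\,{\otimes}L \;=\; \dfrac{\dfrac{\overset{\Pi}{\Gamma, B, C \vdash D}}{\Gamma, B \otimes C \vdash D}\,{\otimes}L \qquad \overset{\Pi'}{D \vdash E}}{\Gamma, B \otimes C \vdash E}\ \text{e-cut} \tag{6}$$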
In equation (6) above, ‘e-cut’ means that the corresponding cut is an essential one. The equations
above may be read as defining associativity and commutativity of proofs in the presence of proper
axioms. We now define another equivalence relation T(N) on proofs to be the least equivalence
relation generated by the equations defining S plus the following equation.

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi'}{\Delta \vdash B}}{\Gamma, \Delta \vdash A \otimes B}\,{\otimes}R \qquad \dfrac{\dfrac{\overset{\Pi''}{A \vdash C} \qquad \overset{\Pi'''}{B \vdash D}}{A, B \vdash C \otimes D}\,{\otimes}R}{A \otimes B \vdash C \otimes D}\,{\otimes}L}{\Gamma, \Delta \vdash C \otimes D}\ \text{Cut} \;=\; \dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi''}{A \vdash C}}{\Gamma \vdash C}\ \text{Cut} \qquad \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \overset{\Pi'''}{B \vdash D}}{\Delta \vdash D}\ \text{Cut}}{\Gamma, \Delta \vdash C \otimes D}\,{\otimes}R \tag{7}$$

This last rule corresponds to the functoriality of the tensor operation. Via translation, the
relations S(N) and T(N) as we have just defined them correspond to the theories S[N] and T[N]
as given in [5]. Their results there demonstrate the correspondence between these equivalence
classes of proofs and the processes in [3]. We will therefore refer to equivalence classes of proofs in
S(N) and T(N) as S-processes and T-processes respectively (dropping the N when it is understood
as fixed).


3  Strongly  normalizing  cut  reduction. 


In this section we give a set of reduction rules for proofs and show that they are strongly normalizing.
This will provide the desired algorithm for finding the normal representative of a T-process.
Assume that a proof V ends with an inessential cut and has the following form:

$$V:\quad \dfrac{\Gamma \vdash A \qquad \Delta, A \vdash B}{\Gamma, \Delta \vdash B}\ \text{Cut:}A$$

We will refer to the left and right sub-proofs as V' and V'', respectively. We will divide these
reduction rules into three classes (axiom, permutation, and logical) and give an illustrative
transformation in each class.

1. Axioms. This case is applicable when at least one of the sub-proofs is an axiom. When V'
is an axiom, we have the following transformation:

$$\dfrac{A \vdash A \qquad \Delta, A \vdash B}{\Delta, A \vdash B}\ \text{Cut:}A \quad\Longrightarrow\quad \Delta, A \vdash B$$


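2. Permutation. This rule is applied when at least one of the sub-proofs V' and V'' terminates
with a logical rule with the main formula being different from the cut formula A or with an essential
cut. For the case when the last rule of V'' is a ⊗R and the cut formula A is in the upper left sequent of
the last rule of V'', we have the following rewrite:

$$\dfrac{\Gamma \vdash A \qquad \dfrac{\Delta', A \vdash B \qquad \Delta'' \vdash C}{\Delta', \Delta'', A \vdash B \otimes C}\,{\otimes}R}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\ \text{Cut:}A \quad\Longrightarrow\quad \dfrac{\dfrac{\Gamma \vdash A \qquad \Delta', A \vdash B}{\Gamma, \Delta' \vdash B}\ \text{Cut:}A \qquad \Delta'' \vdash C}{\Gamma, \Delta', \Delta'' \vdash B \otimes C}\,{\otimes}R$$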


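3. Logical. This is the case where the cut formula is the main formula of a logical rule in both
V' and V'' and is introduced only by this instance of the rule. The transformation in this case
depends on the outermost logical symbol of the cut formula and since we only have one logical
connective, there is only one case to consider here.

$$\dfrac{\dfrac{\Gamma' \vdash A_1 \qquad \Gamma'' \vdash A_2}{\Gamma', \Gamma'' \vdash A_1 \otimes A_2}\,{\otimes}R \qquad \dfrac{A_1, A_2, \Delta \vdash B}{A_1 \otimes A_2, \Delta \vdash B}\,{\otimes}L}{\Gamma', \Gamma'', \Delta \vdash B}\ \text{Cut:}A_1 \otimes A_2 \quad\Longrightarrow\quad \dfrac{\Gamma'' \vdash A_2 \qquad \dfrac{\Gamma' \vdash A_1 \qquad A_1, A_2, \Delta \vdash B}{\Gamma', A_2, \Delta \vdash B}\ \text{Cut:}A_1}{\Gamma'', \Gamma', \Delta \vdash B}\ \text{Cut:}A_2$$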


The  following  property  of  the  rewrite  rules  is  not  difficult  to  check: 


Proposition 1 (Soundness of Rewrite Rules) The above rewrite rules preserve the
T-equivalence of proofs. ∎


We now show that these reduction rules are strongly normalizing. We will need the following
definition  in  the  proof  of  the  strong  normalization  theorem. 


Definition 2 The grade g of a formula A is the number of occurrences of ⊗ contained in A. The
grade of an inessential cut is the grade of its cut formula.


Thus, by the definition above, the grade of an essential cut is 0.

Theorem  2  (Strong  Normalization)  There  is  no  infinite  reduction  sequence  beginning  with  any 
proof  V. 

Proof: Let the complexity of a proof be a pair (a, b), where

• a = sum of the grades g of the cut formulas of all inessential cuts in the proof.

• b = sum of the nodes above all inessential cuts (including the premisses and conclusion of the
cut).

Clearly, a cut-reduced proof has complexity (0,0). We now show that each step of reduction on
a proof reduces its complexity. Consider the three classes of the transformations above. It is easy
to see that application of these transformations in each case to a proof reduces its complexity.

Axiom: Both a and b are reduced.

Permutation: b is reduced keeping a the same.

Logical: a is reduced.

Thus, all reduction sequences terminate. ∎


In the following section we show that the induced reduction relation on the equivalence classes
modulo the relation S(N) on proofs enjoys the Church-Rosser property. We will then show that every
T-equivalence class has a unique normal process representative by showing that the equivalence
defined by the reduction relation on S-equivalence classes coincides with the relation T. The
S-equivalence class of normal forms will then be the unique process representative of a T-equivalence
class.
4 Normal process representatives.

Let ⇒ be the reduction relation on proofs and let $\Rightarrow_S$ be the induced reduction relation on the
equivalence classes of proofs modulo the equivalence relation S. Our aim is to show that $\Rightarrow_S$ is
weakly Church-Rosser.

Theorem 3 (Weak Church-Rosser) The relation $\Rightarrow_S$ satisfies the weak diamond property, that
is, if $\Pi \Rightarrow_S \Pi'$ and $\Pi \Rightarrow_S \Pi''$, then there is a Ξ such that $\Pi' \Rightarrow_S^{*} \Xi$ and $\Pi'' \Rightarrow_S^{*} \Xi$.

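Proof: (Sketch) The result is proved by induction on the structure of the proof tree by analyzing
the various cases that can arise. The case where the last rule used is a ⊗L or a ⊗R follows from
the induction hypothesis. The same holds when the last rule is an essential cut or when the last
rule is an inessential cut and a reduction cannot be applied to this last rule. The case where the latter
does not hold is the only interesting case. In this case, a permutation rule may be applied in two
different ways. Analyzing the different possibilities, we show the existence of a common reduct Ξ by
applying further reduction steps to Π' and Π''. For example, let Π be of the form

$$\dfrac{\dfrac{\overset{\Pi_1}{\Gamma, B, C \vdash A}}{\Gamma, B \otimes C \vdash A}\,{\otimes}L \qquad \dfrac{\overset{\Pi_2}{\Delta', A \vdash D} \qquad \overset{\Pi_3}{\Delta'' \vdash E}}{\Delta', \Delta'', A \vdash D \otimes E}\,{\otimes}R}{\Gamma, B \otimes C, \Delta', \Delta'' \vdash D \otimes E}\ \text{Cut}$$

then let Π' and Π'' be obtained by application of the permutation rule for ⊗L and ⊗R, respectively.
That is,

$$\Pi' \;=\; \dfrac{\dfrac{\overset{\Pi_1}{\Gamma, B, C \vdash A} \qquad \dfrac{\overset{\Pi_2}{\Delta', A \vdash D} \qquad \overset{\Pi_3}{\Delta'' \vdash E}}{\Delta', \Delta'', A \vdash D \otimes E}\,{\otimes}R}{\Gamma, B, C, \Delta', \Delta'' \vdash D \otimes E}\ \text{Cut}}{\Gamma, B \otimes C, \Delta', \Delta'' \vdash D \otimes E}\,{\otimes}L$$

$$\Pi'' \;=\; \dfrac{\dfrac{\dfrac{\overset{\Pi_1}{\Gamma, B, C \vdash A}}{\Gamma, B \otimes C \vdash A}\,{\otimes}L \qquad \overset{\Pi_2}{\Delta', A \vdash D}}{\Gamma, B \otimes C, \Delta' \vdash D}\ \text{Cut} \qquad \overset{\Pi_3}{\Delta'' \vdash E}}{\Gamma, B \otimes C, \Delta', \Delta'' \vdash D \otimes E}\,{\otimes}R$$

Now Π' can be reduced to

$$\Xi' \;=\; \dfrac{\dfrac{\dfrac{\overset{\Pi_1}{\Gamma, B, C \vdash A} \qquad \overset{\Pi_2}{\Delta', A \vdash D}}{\Gamma, B, C, \Delta' \vdash D}\ \text{Cut} \qquad \overset{\Pi_3}{\Delta'' \vdash E}}{\Gamma, B, C, \Delta', \Delta'' \vdash D \otimes E}\,{\otimes}R}{\Gamma, B \otimes C, \Delta', \Delta'' \vdash D \otimes E}\,{\otimes}L$$

by another application of a permutation rule and similarly Π'' can be reduced to

$$\Xi'' \;=\; \dfrac{\dfrac{\dfrac{\overset{\Pi_1}{\Gamma, B, C \vdash A} \qquad \overset{\Pi_2}{\Delta', A \vdash D}}{\Gamma, B, C, \Delta' \vdash D}\ \text{Cut}}{\Gamma, B \otimes C, \Delta' \vdash D}\,{\otimes}L \qquad \overset{\Pi_3}{\Delta'' \vdash E}}{\Gamma, B \otimes C, \Delta', \Delta'' \vdash D \otimes E}\,{\otimes}R$$

It is easy to see that Ξ' S Ξ'' (this is an instance of equation (1)) and thus the required existence of Ξ has been shown. ∎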

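Since the reduction rules are strongly normalizing by Theorem 2, we use Newman's Lemma
(see [2], page 58), which says that WCR and SN together imply CR, to conclude that $\Rightarrow_S$ satisfies the
diamond property: if $\Pi \Rightarrow_S^{*} \Pi'$ and $\Pi \Rightarrow_S^{*} \Pi''$, then there is a Ξ such that $\Pi' \Rightarrow_S^{*} \Xi$ and
$\Pi'' \Rightarrow_S^{*} \Xi$. This will be used in the proof of Theorem 4 below.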


Definition 3 A normal process representative is an S-equivalence class of normal forms.

Theorem 4 (Unique Process Representative) Let N be a net. In every T-equivalence class,
there is a unique normal process representative.

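Proof: Let Π and Π' be two S-equivalence classes. Define Π ⇓ Π' if they both reduce to the same
normal form modulo the equivalence S. To prove the theorem, we only have to show that Π ⇓ Π' iff
Π T Π', i.e., the two equivalences coincide. Since the only if part follows from the soundness of the
rewrite rules, we are only left with the if part. To prove the if part, we show that if two proofs are
equivalent by virtue of equation (7) in section 2, then there is a sequence of reductions from one
to the other. We thus rewrite the left-hand side of equation (7) to a form which is S-equivalent
to the right-hand side of the equation.

$$\dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi'}{\Delta \vdash B}}{\Gamma, \Delta \vdash A \otimes B}\,{\otimes}R \qquad \dfrac{\dfrac{\overset{\Pi''}{A \vdash C} \qquad \overset{\Pi'''}{B \vdash D}}{A, B \vdash C \otimes D}\,{\otimes}R}{A \otimes B \vdash C \otimes D}\,{\otimes}L}{\Gamma, \Delta \vdash C \otimes D}\ \text{Cut}$$

$$\Longrightarrow\quad \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \dfrac{\overset{\Pi''}{A \vdash C} \qquad \overset{\Pi'''}{B \vdash D}}{A, B \vdash C \otimes D}\,{\otimes}R}{\Gamma, B \vdash C \otimes D}\ \text{Cut}}{\Gamma, \Delta \vdash C \otimes D}\ \text{Cut}$$

$$\Longrightarrow\quad \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi''}{A \vdash C}}{\Gamma \vdash C}\ \text{Cut} \qquad \overset{\Pi'''}{B \vdash D}}{\Gamma, B \vdash C \otimes D}\,{\otimes}R}{\Gamma, \Delta \vdash C \otimes D}\ \text{Cut}$$

$$\Longrightarrow\quad \dfrac{\dfrac{\overset{\Pi}{\Gamma \vdash A} \qquad \overset{\Pi''}{A \vdash C}}{\Gamma \vdash C}\ \text{Cut} \qquad \dfrac{\overset{\Pi'}{\Delta \vdash B} \qquad \overset{\Pi'''}{B \vdash D}}{\Delta \vdash D}\ \text{Cut}}{\Gamma, \Delta \vdash C \otimes D}\,{\otimes}R$$

Thus we have shown the existence of a unique normal S-class for each T-class. ∎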

In the next section we briefly sketch how rewrites on proofs can be viewed as (typed) rewrites on
arrows in a suitable category. A detailed analysis of this relationship will be discussed elsewhere.


5  A  note  on  arrows  vs.  proofs. 

As  we  mentioned  before,  some  authors  have  found  it  convenient  to  work  with  arrows  (in  strictly 
symmetric  strict  monoidal  categories  to  be  exact)  rather  than  proofs  as  we  have  done  in  this  paper. 
To  some  extent  this  is  a  matter  of  taste,  but  it  can  be  illuminating  to  see  things  in  both  ways.  For 
example,  the  rewrite  rules  for  proofs  in  section  3  are  translated  respectively  as  the  following  rewrite 
on  arrows. 

1. 

$$\mathrm{id}^{A \to A} ; f^{A \to B} \;\Rightarrow\; f^{A \to B}$$

2. 

®  iD®E^D®E^  .^gD®A-^B  ^  ^  ®  )  0 

3. 

These may at first sight seem rather unwieldy, but our proof-theoretic results show that they
will work. Moreover, we found that working with proofs helped us to get the right definition of
normal form. Concerning the rewrite system, the proof of Theorem 4 suggests that if one is given
associativity and commutativity of arrows for free, the following rewrite will work whenever the
right-hand side is defined.

$$(f_1 \otimes f_2);(g_1 \otimes g_2) \;\Rightarrow\; (f_1;g_1) \otimes (f_2;g_2)$$

In other words, this rewrite rule will give a unique “normal” S[N] arrow for each T[N] equivalence
class of arrows of [5]. In the rewrite above, the left-hand side is always defined whenever the
right-hand side is defined but not vice-versa. In particular, subject reduction fails drastically, so the
rewrite system must maintain the types of the terms.
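
The need to carry types along with the rewrite can be seen in a minimal Python sketch. It is an illustration only: the `Arrow` class, the representation of objects as tuples of atom names (so that tensor on objects is strictly associative concatenation), and the function names are assumptions, and symmetries are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    name: str
    dom: tuple  # source object, as a tuple of atom names
    cod: tuple  # target object, as a tuple of atom names

def compose(f, g):
    """f ; g, defined only when the target of f equals the source of g."""
    if f.cod != g.dom:
        raise TypeError(f"cannot compose {f.name} with {g.name}")
    return Arrow(f"({f.name};{g.name})", f.dom, g.cod)

def tensor(f, g):
    """f (x) g, always defined: sources and targets are concatenated."""
    return Arrow(f"({f.name}*{g.name})", f.dom + g.dom, f.cod + g.cod)

def interchange(f1, f2, g1, g2):
    """Rewrite (f1 * f2);(g1 * g2) to (f1;g1) * (f2;g2).
    Returns None when the right-hand side is not defined."""
    if f1.cod == g1.dom and f2.cod == g2.dom:
        return tensor(compose(f1, g1), compose(f2, g2))
    return None

# Both sides are defined here.
ok = interchange(Arrow("f1", ("a",), ("b",)), Arrow("f2", ("c",), ("d",)),
                 Arrow("g1", ("b",), ("e",)), Arrow("g2", ("d",), ("f",)))

# Here the left-hand side type checks but the right-hand side does not.
h1, h2 = Arrow("h1", ("a",), ("b", "c")), Arrow("h2", ("d",), ("e",))
k1, k2 = Arrow("k1", ("b",), ("x",)), Arrow("k2", ("c", "e"), ("y",))
lhs = compose(tensor(h1, h2), tensor(k1, k2))
stuck = interchange(h1, h2, k1, k2)
```

In the second example the composite `lhs` is well typed, yet `interchange` returns `None`, which is why the rewrite has to check types before firing.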
Abstract

We  develop  formal  methods  for  reasoning  about  memory  usage  at  a  level  of  abstraction 
suitable  for  establishing  or  refuting  claims  about  the  potential  applications  of  linear  logic  for 
static  analysis.  In  particular,  we  demonstrate  a  precise  relationship  between  type  correctness  for 
a  language  based  on  linear  logic  and  the  correctness  of  a  reference-counting  interpretation  of  the 
primitives  that  the  language  draws  from  the  rules  for  the  ‘of  course’  operation.  Our  semantics 
is  ‘low-level’  enough  to  express  sharing  and  copying  while  still  being  ‘high-level’  enough  to 
abstract  away  from  details  of  memory  layout.  This  enables  the  formulation  and  proof  of  a  result 
describing  the  possible  run-time  reference  counts  of  values  of  linear  type. 
1 Introduction

There have been a variety of efforts to exploit ideas from linear logic for the design and analysis of
programming languages. It is our contention that a perspective on these proposals can be found in
the view that linear logic is a tool for analyzing a structure we call a memory graph which is used
to represent the run-time data of a program. A memory graph is simply a directed graph together
with a finite set of functions that map finite sets called roots into the nodes of the graph. It is a
mathematical abstraction of the run-time structure that holds such data as the activation records
of procedures, heap-allocated objects, and so on. We argue that a programming language based on
linear logic yields fine-grained information about how the memory graph evolves at run-time, thus
providing information that could be exploited in program analysis. In particular, the information
provided by the linear connectives concerns the reference counts of nodes in the memory graph,
where the reference count of a node is the sum of the in-degree of the node in the graph and
the number of root elements mapped to it. Reference counting has a long [Col60, DB76], albeit
controversial [WHHO92, Bak88, App92], history as a technique for avoiding garbage collection.
But,  aside  from  its  direct  use  in  managing  memory,  reference  counting  can  offer  a  unifying  view  of 
many  code  optimizations,  and  various  code  generation  strategies  can  be  seen  as  attempts  to  control 
reference  counts  so  that  optimizations  can  be  performed.  For  example,  the  correctness  of  in-place 
updating  relies  on  ensuring  that  the  reference  count  of  an  object  is  one. 

Attempts  to  study  programming  languages  like  the  one  in  this  paper  fall  roughly  into  two 
groups.  There  are  those  that  use  some  analog  of  the  Curry-Howard  correspondence  [How80]  as 
the basis for the design of a language based on linear logic [Abr, Hol88, Laf88, LM92, Mac91], and
those  that  consider  systems  similar  to  linear  logic  (hereafter  called  ‘LL’)  for  specific  applications  (for 
instance,  [GH90]  and  [Wad91b]  consider  systems  to  detect  single-threading).  The  system  presented 
in  this  paper  falls  into  the  former  category,  except  to  the  extent  that  we  have  added  some  additional 
constructs,  such  as  recursive  functions,  that  bring  us  closer  to  traditional  functional  programming 
languages. 

To convey some of the spirit of the ideas and constructs discussed in the literature just
mentioned, let us look at a concrete example. Consider the program on the left side of Table 1. The
code implements the addition function in terms of functions for incrementing and decrementing.
The syntax is that of SML using a set of familiar computational primitives such as recursive
definition, branching conditional, and local definition. Looking closely at the definition of addition, it is
possible  to  note  a  difference  between  how  the  formal  parameters  of  add,  the  variables  x  and  y,  are 
used  in  the  body  of  the  definition.  The  value  of  x  is  needed  in  the  test  of  the  conditional,  which 
is  always  evaluated,  and  in  the  else  branch  of  the  conditional,  which  may  not  be  evaluated,  but 
not  in  the  then  branch  of  the  conditional.  On  the  other  hand,  the  variable  y  is  needed  regardless 
of  whether  the  then  or  else  branch  of  the  conditional  in  the  body  is  taken.  In  particular,  its  value 
is  needed  only  once,  not  twice — as  may  be  the  case  with  x.  This  brings  out  two  aspects  of  the 
difference  between  x  and  y:  first,  that  the  two  variables  may  be  used  a  different  number  of  times  (y 
exactly  once  and  x  either  once  or  twice)  and,  second,  that  the  value  of  x  must  be  shared  between 
its  two  separate  uses. 

Table 1: Translating to a Linear-Logic-Based Language.

let fun add x y =                  let fun add x y =
      if x = 0                           share w,z as x in
      then y                               if fetch w = 0
      else add (x-1) (y+1)                 then dispose z, add before y
in add 2 1                                 else (fetch add)
end                                               (store ((fetch z)-1))
                                                  (y+1)
                                   in add (store 2) 1
                                   end

The program on the right is a version of the addition function written in a program with
‘linear logic annotations’ (using a slight simplification of the notation that we will define precisely
later). There are four new primitives used here: share, dispose, store, and fetch. The share
primitive indicates that x is needed twice: the first use is bound to the variable w and the second
to the variable z. These two variables share the value to which x is bound. The dispose primitive
indicates that one of these sharing variables, z, is not used in the first branch of the conditional. The
primitive store creates a sharable value and fetch obtains a shared value. In our interpretation,
the LL-specific operations share and dispose explicitly manage reference counts of the share’able
and dispose’able objects that are created and consulted by being store’d and fetch’ed. For the
example in Table 1, the occurrence of share indicates that two pointers are needed for the value
associated with x (so the reference count of the associated value is incremented), but in the then
branch of the conditional, one of the pointers is no longer needed (so the reference count of the
associated value is decremented).

Analogs  to  the  store  and  fetch  operations  are  the  delay  and  force  operations  that  appear  in 
many  functional  programming  languages.  In  such  languages,  the  delay  primitive  postpones  the 
evaluation  of  a  term  until  it  is  supplied  to  the  force  primitive  as  an  argument.  When  this  happens, 
the  value  of  the  delayed  term  is  computed,  returned,  and  memoized  for  any  other  applications  of 
force.  Abramsky  [Abr]  has  argued  that  this  is  a  natural  way  to  view  the  operational  semantics  of 
the store and fetch operations of LL; we will follow this approach as well. The dispose primitive
has an analog (and namesake) in several programming languages. Typically, an object is disposed
by being deallocated; this operation is unsafe because it can lead to dangling pointers. In our LL
language the primitive dispose will only deallocate memory if this is safe since its semantics will be
to decrement a reference count; deallocation only happens when this count falls to zero. The share
command is unique to LL, and its name accurately reflects the way in which it will be interpreted.
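
The reference-counting reading of share and dispose can be sketched in Python as follows. This is a deliberately simplified toy and not the semantics defined later in the paper: the `Heap` class, its method names, and the use of integer locations are assumptions, and thunks and memoization are left out.

```python
class Heap:
    """Toy heap mapping a location to a [value, reference_count] cell."""

    def __init__(self):
        self.cells = {}
        self.next_loc = 0

    def store(self, value):
        """Allocate a cell for a sharable value; it starts with one pointer."""
        loc = self.next_loc
        self.next_loc += 1
        self.cells[loc] = [value, 1]
        return loc

    def fetch(self, loc):
        """Read the value held at a location."""
        return self.cells[loc][0]

    def share(self, loc):
        """Two pointers are now needed, so the count is incremented."""
        self.cells[loc][1] += 1
        return loc, loc

    def dispose(self, loc):
        """Drop one pointer; deallocate only when the count falls to zero."""
        self.cells[loc][1] -= 1
        if self.cells[loc][1] == 0:
            del self.cells[loc]

    def count(self, loc):
        """Current reference count, or 0 if the cell has been deallocated."""
        return self.cells[loc][1] if loc in self.cells else 0

heap = Heap()
x = heap.store(2)
w, z = heap.share(x)          # count of the cell is now 2
heap.dispose(z)               # count falls to 1, the cell survives
value_seen_through_w = heap.fetch(w)
heap.dispose(w)               # count falls to 0, the cell is deallocated
```

The point of the sketch is that `dispose` never leaves a dangling pointer: the cell for `x` is reclaimed only after both `w` and `z` have been disposed.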

One  of  our  primary  goals  in  this  paper  is  to  offer  an  approach  for  rigorously  expressing  and 
proving  optimizations  obtained  by  analyzing  an  LL-based  language.  In  particular,  there  is  an 
adage  that  ‘linear  values  have  only  one  pointer  to  them’  or  ‘linear  values  can  be  updated  in  place’. 
Wadler  [Wad90]  has  informally  observed  that  these  claims  must  be  stated  with  some  care:  a 
reference  count  of  one  can  be  maintained  by  copying,  but  this  would  negate  the  advantage  of  in- 
place  updating.  Our  operational  semantics  allows  us  to  check  the  claim  rigorously:  in  particular, 
we  show  that  linear  variables  may  fail  to  have  a  count  of  one  in  our  reference-counting  operational 
semantics,  which  uses  sharing  heavily;  when  this  is  the  case,  a  linear  variable  does  not  have  a 
unique  pointer  to  it  and  cannot  safely  be  updated  in  place.  The  problem  arises  when  a  linear 
variable  falls  within  the  scope  of  an  abstraction  over  a  non-linear  variable.  We  express  a  theorem 
asserting  precisely  when  the  value  of  a  linear  variable  does  indeed  maintain  a  reference  count  of  at 
most one.

A  broader  theme  of  our  investigation  is  developing  a  level  of  abstraction  in  the  semantics  of 
programming languages that permits ‘low-level’ concepts to be formalized in a clear and relevant
way. There has been significant progress in formulating theorems about programming languages
and memory ([GG92] and [WO92] are recent examples treating garbage collection and run-time
storage  representation  respectively).  It  is  our  hope  that  we  can  contribute  to  a  foundation  for 
further  advances  in  this  direction. 

We will be concerned only with the question of a computational interpretation of intuitionistic
linear logic, the fragment of the language without negation and the ‘par’ operation. In fact, we will
restrict ourselves to the language obtained from the linear implication (s ⊸ t) and ‘of course’ (!s)
operations, although our results can be extended to all of intuitionistic LL. For the rest of the paper,
read ‘linear logic’ to mean the implicational fragment of intuitionistic linear logic. We present our
language and its properties in stages. The second section of the paper discusses the operational
semantics of memoization with the aim of putting in place the basic notation and approach that
will be used in subsequent sections. The third section describes the syntax, typing rules, and
‘high-level’ operational semantics for our LL-based language. The fourth section describes the
‘low-level’ operational semantics of the language. The invariants that express the basic properties
of the memory graph in this semantics are precisely expressed and proved. The fifth section of the
paper demonstrates further basic properties of this semantics, including its correspondence to the
high-level semantics and its independence from the scheme used to allocate new memory. The sixth
section uses the operational semantics to prove a static condition under which a linear value will
always have a reference count of one; this shows that the LL-based language is indeed amenable to
analysis  about  memory  usage.  The  seventh  section  discusses  various  aspects  of  the  technical  results 
of  the  paper  and  attempts  to  provide  additional  perspective.  Some  of  the  most  technical  proofs 
have  been  deferred  to  an  appendix. 


2  Operational  Semantics  with  Memory 

Here  we  give  a  preview  of  the  operational  semantics  of  the  LL-based  language  by  describing  the 
familiar  operational  semantics  of  a  simple  functional  language  with  store  (delay)  and  fetch  (force) 
operations.  We  base  this  preliminary  discussion  on  a  language  with  the  grammar 

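$$\begin{array}{rcl}
M & ::= & x \mid (\lambda x.\, M) \mid (M\ M) \mid \\
  &     & n \mid \mathtt{true} \mid \mathtt{false} \mid (\mathtt{succ}\ M) \mid (\mathtt{pred}\ M) \mid (\mathtt{zero?}\ M) \mid \\
  &     & (\mathtt{if}\ M\ \mathtt{then}\ M\ \mathtt{else}\ M) \mid (\mathtt{fix}\ M) \mid \\
  &     & (\mathtt{store}\ M) \mid (\mathtt{fetch}\ M)
\end{array}$$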

where  x  and  n  are  from  primitive  syntax  classes  of  variables  and  numerals  respectively.  This  is  a 
variant  of  PCF  [Sco,  Plo77,  BGS90]  augmented  by  primitive  operations  for  forcing  and  delaying 
evaluations.  The  expression  (fix  M)  is  used  for  recursive  definitions. 

The  key  to  providing  a  semantics  for  this  language  is  to  represent  the  memoization  used  in 
computing the fetch primitive so that certain recomputation is avoided. We aim to provide a
semantics at a fairly high level of abstraction using what is sometimes known as a natural
semantics [Des86, Kah87]. Such a semantics has been described in [PS91] using explicit substitution and
in [Lau93] through the use of an intermediate representation in which all function applications have
variables as arguments. Both of these approaches are appealingly simple but slightly more abstract
than we would like for our purposes in this paper. Our own approach, first described in [CGR92],
is based on a distinction between an environment which is an association of variables with
locations and a store which is an association of values with locations. Sharing of computation results
is  achieved  through  creating  multiple  references  to  a  location  that  holds  a  delayed  computation 
called  a  thunk.  When  the  value  delayed  in  the  thunk  is  needed,  it  is  calculated  and  memoized  for 
future  reference.  To  define  this  precisely  we  must  begin  with  some  notation  and  basic  operations 
for  environments,  stores,  and  memory  allocation. 

Fix an infinite set of locations Loc, with the letter l denoting elements of this set. Let us say
that a partial function is finite just in case its domain of definition is finite.

• An environment is a finite partial function from variables to locations; ρ denotes an
environment, and Env denotes the set of all environments. The notation ρ(x) returns the location
associated with variable x in ρ, and to update an environment, we use the notation

$$(\rho[x \mapsto l])(y) = \begin{cases} l & \text{if } x = y \\ \rho(y) & \text{otherwise.} \end{cases}$$

The symbol ∅ denotes the empty environment; we also use [x ↦ l] as shorthand for ∅[x ↦ l].

• A value is a

- numeral k,

- boolean b,

- pointer susp(l) or rec(l, f), or

- closure closure(λx. M, ρ) or recclosure(λx. M, ρ).

The letter V denotes a value, and Value denotes the set of values.
• A storable object is either a value or a thunk thunk(M, ρ). We use Storable to denote the
set of storable objects.

• A store is a finite partial function σ from Loc to Storable. The symbol σ denotes a store, ∅
denotes the empty store, and Store denotes the set of stores. We will use the same notation
to update stores as for updating environments.

Given a store σ and a location l, we define σ[l ↦ S] to be the store obtained by updating σ by
binding l to the storable object S. We also need a relation for allocating memory cells. A subset R
of the product (Storable × Store) × (Loc × Store) is an allocation relation if, for any store σ and
storable object S, there is an l' and σ' where (S, σ) R (l', σ') and

• l' ∉ dom(σ) and dom(σ') = dom(σ) ∪ {l'};

• for all locations l ∈ dom(σ), σ(l) = σ'(l); and

• σ'(l') = S.

This definition abstracts away from the issue of exactly how new locations are found. For specificity,
we choose an allocation relation new that is a function, and write new(S, σ) for the pair (l', σ') such
that (S, σ) new (l', σ'). Of course, our operational semantics should be independent of the choice
of allocation relation, a point we will formalize after describing the semantics of our LL-based
language below.
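
One possible allocation function, together with a check of the three conditions above, can be sketched in Python. The choice of integers as locations, of dictionaries as stores, and of "least unused integer" as the allocation scheme are all assumptions made for the illustration; any other scheme meeting the conditions would serve equally well.

```python
def new(storable, store):
    """Return (l', store') where l' is fresh and store' extends store
    by binding l' to storable. The input store is left unchanged."""
    fresh = 0
    while fresh in store:
        fresh += 1
    extended = dict(store)
    extended[fresh] = storable
    return fresh, extended

def satisfies_allocation_conditions(storable, store, loc, new_store):
    """Check the three conditions required of an allocation relation."""
    return (
        loc not in store                                   # l' is fresh
        and set(new_store) == set(store) | {loc}           # domain grows by l'
        and all(new_store[l] == store[l] for l in store)   # old cells unchanged
        and new_store[loc] == storable                     # new cell holds S
    )

sigma = {0: 5, 1: True}
l_new, sigma_new = new(("thunk", "M", {}), sigma)
allocation_ok = satisfies_allocation_conditions(
    ("thunk", "M", {}), sigma, l_new, sigma_new)
```

Copying the dictionary keeps the old store available, which matches the functional reading in which evaluation maps a store to a new store.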

The operational rules for our language could be given using a natural semantics with rules of
the form (M, ρ, σ) ⇓ (l, σ') where the domain of ρ contains the set of free variables of M, and l is a
location in the domain of σ' that holds the result of evaluation. Writing the semantics in the form
of rules (e.g., as in the appendix of [CGR92]) becomes somewhat cumbersome, so we use a kind of
primitive pseudo-code that can readily be translated into a natural semantics. As a first example,
consider how the store primitive is evaluated:

    meminterp((store M), ρ, σ) =
        let (l0, σ0) = new(thunk(M, ρ), σ)
        in new(susp(l0), σ0)

Read this as follows. To evaluate (store M) in the environment ρ and store σ, first allocate a new
location holding a thunk composed of M and the environment ρ. Let σ0 be the new store and l0
be the location in which the thunk is held. The result of the evaluation is a store obtained from
σ0 by allocating a new location holding the storable value susp(l0), paired with this new location.
Note, in particular, that M is not evaluated. The structure that has been added to the memory is
depicted in Figure 1.

The interesting part of the evaluator and the essence of memoization is given by the way in
which the fetch primitive is handled. The argument of fetch is evaluated to return a storable value of
the form susp(l1). The content of location l1 is then examined to determine whether the suspension
has been evaluated to a value or whether it has not yet been evaluated, in which case it has the
form thunk(N, ρ'). If the content is a value, a pointer to the value is returned; otherwise the thunk is
evaluated and l1 duly updated with its value. A pointer to the value of the thunk is then returned
as the result. Here is the pseudo-code description:
[Figure 1: Structure generated by (store M). Diagram not reproduced: a cell holding a susp pointer
refers to a cell holding thunk(M, ρ).]


    meminterp((fetch M), ρ, σ) =
        let (l0, σ0) = meminterp(M, ρ, σ)
        in case σ0(l0) of susp(l1) =>
             case σ0(l1)
               of thunk(N, ρ') =>
                    let (l2, σ1) = meminterp(N, ρ', σ0)
                    in (l2, σ1[l0 ↦ susp(l2)])
                | _ => (l1, σ0)

Note that there is no clause for the case when σ0(l0) is not a suspension. In this case, we assume
that the behavior of the interpreter on (fetch M) is undefined. This assumption simplifies the rules,
and allows us to ignore what are, in effect, run-time type errors. Our other rules will also ignore
run-time type errors.
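The store and fetch clauses can be exercised with a minimal sketch. The term encoding (tagged tuples), the `succ` and `var` clauses, the `trace` list used to observe when work is done, and the allocator `new` are all assumptions introduced for illustration; only the `store` and `fetch` branches follow the pseudo-code above, including its choice of rebinding l0.

```python
from typing import Any, Dict, List, Tuple

Store = Dict[int, Any]
Env = Dict[str, int]


def new(obj: Any, store: Store) -> Tuple[int, Store]:
    """Hypothetical allocator: next integer location, non-destructive update."""
    loc = max(store, default=-1) + 1
    return loc, {**store, loc: obj}


def meminterp(term, env: Env, store: Store, trace: List[str]) -> Tuple[int, Store]:
    tag = term[0]
    if tag == "num":                         # numerals are values: put one in a cell
        return new(("num", term[1]), store)
    if tag == "var":                         # variables denote locations
        return env[term[1]], store
    if tag == "succ":
        l0, s0 = meminterp(term[1], env, store, trace)
        trace.append("succ")                 # record that real work happened
        return new(("num", s0[l0][1] + 1), s0)
    if tag == "store":                       # delay: the body is NOT evaluated here
        l0, s0 = new(("thunk", term[1], env), store)
        return new(("susp", l0), s0)
    if tag == "fetch":
        l0, s0 = meminterp(term[1], env, store, trace)
        if s0[l0][0] != "susp":              # the paper leaves this case undefined
            raise TypeError("fetch applied to a non-suspension")
        l1 = s0[l0][1]
        cell = s0[l1]
        if cell[0] == "thunk":
            _, body, env1 = cell
            l2, s1 = meminterp(body, env1, s0, trace)
            return l2, {**s1, l0: ("susp", l2)}   # memoize: l0 now points at the value
        return l1, s0                        # already evaluated: share the value
    raise ValueError(f"unknown term {term!r}")
```

Fetching the same suspension twice through a variable runs the delayed `succ` only once; the second fetch returns the location produced by the first.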

There is another approach we might have taken to modelling memoization. The interpretation
of (store M) allocates a location l0 that holds a thunk, and returns a location l1 that holds a pointer
susp(l0) to this location. Could we instead have returned l0 as the value? That is, the rule could
read

    meminterp'((store M), ρ, σ) = new(thunk(M, ρ), σ)

The answer to this question is instructive, since it relates to the way in which we will represent the
distinction between copying and sharing in our model. If we choose to return the location holding
the thunk as the value of the store (as opposed to returning a location holding the pointer to this
thunk), then this would require a change in the fetch command. In particular, when the location
l2 is obtained there, it would be essential to put the value σ1(l2) in the location where the value of
the thunk may be sought later:

    meminterp'((fetch M), ρ, σ) =
        let (l0, σ0) = meminterp'(M, ρ, σ)
        in case σ0(l0)
             of thunk(N, ρ') =>
                  let (l2, σ1) = meminterp'(N, ρ', σ0)
                  in (l0, σ1[l0 ↦ σ1(l2)])
              | _ => (l0, σ0)

Note that in the second line from the bottom of the program, the values of l0 and l2 in the store
are the same and we will say that the value of the thunk has been copied from location l2 to l0. In
the case that σ1(l2) is a ‘small’ value, like an integer that occupies only a word of storage, there is
little difference between copying the value from l2 to l0 versus returning a pointer to l2 as we did in
the earlier implementation. If the value σ1(l2) is ‘large’, however, then copying may be expensive.
In the language we are considering, this might involve copying a closure, which would be a modest
expense, but in a fuller language it might involve copying a string or functional array, which could
be very expensive. (If σ1(l2) is a mutable value, then the copying is probably incorrect, but this
is not a problem for the functional language at hand.) Our semantics does not directly represent
the cost associated with copying because it abstracts away from a measure of the size of a value;
instead, we will treat copying as if it is something to be avoided in favor of sharing (indirection)
whenever this is feasible. This suggests yet a third approach to the semantics of fetch where store
is implemented as with meminterp' but where the interpretation of fetch uses an indirection for the
returned value:

    meminterp''((fetch M), ρ, σ) =
        let (l0, σ0) = meminterp''(M, ρ, σ)
        in case σ0(l0)
             of thunk(N, ρ') =>
                  let (l2, σ1) = meminterp''(N, ρ', σ0)
                  in (l0, σ1[l0 ↦ @l2])
              | _ => (l0, σ0)

where @l2 is to be viewed as a boxed value. This is possibly closer to the way memoization would
be implemented in most compilers. For the semantics we use for the LL-based language in the next
section this approach complicates the semantics slightly and is less efficient because of the way the
reference counting is done. Otherwise, our approach could accommodate this alternative without
major changes.
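The difference between the copying update and the indirection update can be seen on a bare store. This is a sketch only: the function names, the tuple encoding of cells, and the `("box", l)` representation of the boxed value are assumptions made for illustration.

```python
from typing import Any, Dict

Store = Dict[int, Any]


def update_by_copy(store: Store, l0: int, l2: int) -> Store:
    """Copying update: duplicate the contents of l2 into l0."""
    return {**store, l0: store[l2]}


def update_by_indirection(store: Store, l0: int, l2: int) -> Store:
    """Indirection update: l0 holds a boxed pointer to l2."""
    return {**store, l0: ("box", l2)}


def contents(store: Store, loc: int) -> Any:
    """Follow boxed pointers until a non-box cell is reached."""
    while store[loc][0] == "box":
        loc = store[loc][1]
    return store[loc]
```

Both updates make the value reachable from l0; the copying variant stores a second copy of the cell contents, while the indirection variant stores a single pointer regardless of how large the value is.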

The  implementation  of  memoization  involves  the  idea  of  mutating  a  store.  Even  the  ‘functional’ 
parts  of  the  language  must  respect  the  potential  side  effects  to  the  store  that  memoization  may 
cause.  Hence  these  operations  must  pass  the  store  along  in  an  appropriate  manner.  Doing  this 
correctly  may  save  recomputation.  Here,  for  instance,  is  how  the  application  operation  is  described: 

    meminterp((M N), ρ, σ) =
        let (l0, σ0) = meminterp(M, ρ, σ)
            (l1, σ1) = meminterp(N, ρ, σ0)
        in case σ1(l0) of closure(λx. P, ρ') =>
             meminterp(P, ρ'[x ↦ l1], σ1)

The store resulting from evaluating M is used in evaluating N; similarly, the store resulting from
evaluating N is used in evaluating the application.
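The store threading in the application clause can be sketched as follows. The tuple encoding of terms and cells and the allocator `new` are assumptions of this sketch; the order in which stores are passed along follows the pseudo-code.

```python
from typing import Any, Dict, Tuple

Store = Dict[int, Any]
Env = Dict[str, int]


def new(obj: Any, store: Store) -> Tuple[int, Store]:
    """Hypothetical allocator: next integer location, non-destructive update."""
    loc = max(store, default=-1) + 1
    return loc, {**store, loc: obj}


def meminterp(term, env: Env, store: Store) -> Tuple[int, Store]:
    tag = term[0]
    if tag == "num":
        return new(("num", term[1]), store)
    if tag == "var":
        return env[term[1]], store
    if tag == "lam":                              # ("lam", x, body)
        return new(("closure", term[1], term[2], env), store)
    if tag == "app":                              # ("app", M, N)
        l0, s0 = meminterp(term[1], env, store)   # operator first
        l1, s1 = meminterp(term[2], env, s0)      # operand sees the operator's store
        cell = s1[l0]
        if cell[0] != "closure":                  # run-time type error: left undefined
            raise TypeError("operator is not a closure")
        _, x, body, env1 = cell
        return meminterp(body, {**env1, x: l1}, s1)   # body sees the operand's store
    raise ValueError(f"unknown term {term!r}")
```

Because the parameter is bound to the location l1 rather than to a copy of the argument, any later update of a suspension reachable from l1 is visible to every use of the parameter.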

There are a variety of ways to implement recursion. A reasonably efficient approach is to create
a circular structure. This approach is simplified by restricting the interpreter to programs such
that, in constructs of the form (fix N), the term N has the form λf. λx. M. The restriction is
not necessary, but it is typical for call-by-value programming languages. The semantics for such
recursions is given by

    meminterp((fix λf. λx. M), ρ, σ) =
        let (l0, σ0) = new(0, σ)
            (l1, σ1) = new(recclosure(λx. M, ρ[f ↦ l0]), σ0)
        in (l0, σ1[l0 ↦ rec(l1, f)])
[Figure 2: Structure generated by (fix λf. λx. M). Diagram not reproduced: a cell l0 holding
rec(l1, f) points to a cell l1 holding recclosure(λx. M, ρ[f ↦ l0]), which points back to l0.]


which  creates  the  circular  structure  in  Figure  2.  For  this  language  we  could  create  a  single  cell 
holding  the  recclosure  that  looped  back  to  itself;  we  use  two  cells,  though,  since  the  additional 
cell  holding  rec  will  be  used  in  the  semantics  of  the  LL-based  language  to  facilitate  connections 
with  the  type  system.  We  also  need  here  to  change  the  semantics  of  applications  so  that  if  the 
operator  evaluates  to  a  rec,  the  pointer  is  traced  to  a  recclosure;  in  turn,  if  the  operator  evaluates 
to  a  recclosure,  the  operator  is  used  in  the  same  way  as  a  closure. 
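The two-cell cycle and the pointer tracing needed at application time can be sketched directly. The helper names `alloc_fix` and `resolve_operator`, the tuple encoding of cells, and the allocator `new` are assumptions made for this illustration.

```python
from typing import Any, Dict, Tuple

Store = Dict[int, Any]
Env = Dict[str, int]


def new(obj: Any, store: Store) -> Tuple[int, Store]:
    """Hypothetical allocator: next integer location, non-destructive update."""
    loc = max(store, default=-1) + 1
    return loc, {**store, loc: obj}


def alloc_fix(f: str, x: str, body: Any, env: Env, store: Store) -> Tuple[int, Store]:
    """Build the circular structure for (fix λf. λx. body)."""
    l0, s0 = new(("num", 0), store)                         # placeholder cell
    l1, s1 = new(("recclosure", x, body, {**env, f: l0}), s0)
    return l0, {**s1, l0: ("rec", l1, f)}                   # back-patching closes the cycle


def resolve_operator(loc: int, store: Store) -> Any:
    """Trace a rec pointer to its recclosure; closures are returned unchanged."""
    cell = store[loc]
    if cell[0] == "rec":
        cell = store[cell[1]]
    if cell[0] in ("closure", "recclosure"):
        return cell
    raise TypeError("operator is not a function")
```

A recursive call looks up f in the recclosure's environment, finds l0 again, and so reuses the same two cells instead of allocating a fresh closure.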

In  the  implementation  of  actual  functional  programming  languages,  a  single  recursion  such  as 
the  one  above  would  probably  make  its  recursive  calls  through  a  jump  instruction.  This  would 
be  quite  difficult  to  formalize  with  the  source-code-based  approach  we  are  using  to  describe  the 
interpreter. The important thing, for our purposes, is that recursive calls to f do not allocate
further  memory  for  the  recursive  closure.  This  means  that,  as  far  as  memory  is  concerned,  there 
is  little  difference  between  implementing  the  recursion  with  the  jump  and  implementing  it  with  a 
circular  structure.  The  cycle  created  in  this  way  introduces  extra  complexity  into  the  structure  of 
memory,  of  course,  but  the  cycles  introduced  in  this  way  must  have  precisely  the  form  pictured  in 
Figure  2. 

It is easy to provide a clean type system for the language described above. One technical
convenience is to tag certain bindings with types (such as the binding occurrence in an abstraction
λx : s. M) to ensure that a given program has a unique type derivation. When it is not important for
the discussion at hand, we will often drop the tags on bound variables to reduce clutter. The types
for the language include ground types Nat and Bool for numbers and booleans respectively, higher
types (s → t) for functions between s and t, and a unary operation !s for the delayed programs of
type s. The typing rules for store and fetch are introduction and elimination operations respectively:

          M : s                    M : !s
    ------------------       ------------------
      (store M) : !s           (fetch M) : s

These  operations  will  also  be  found  in  our  LL-based  language  with  essentially  the  same  types. 
3  A  Programming  Language  Based  on  Linear  Logic

Term  assignment  for  linear  logic. 

If a programming language L is to be based on LL, it seems reasonable to attempt the completion
of an analogy based on the Curry-Howard correspondence: intuitionistic logic is to traditional
functional programming languages (such as ML or Haskell) as LL is to L. Basing a language on
the Curry-Howard correspondence for LL immediately becomes problematic, as LL was originally
described by Girard [Gir87] using a sequent calculus. Most programming languages have a syntax
and typing system like the natural deduction (hereafter called ‘ND’) formulation of intuitionistic
logic rather than its sequent calculus formulation, since type-checking algorithms are easier to
describe for ND formulations. Progress on an ND form for intuitionistic LL has been gradual, in
part because substitutivity fails for the obvious formulations:

Definition 1 A type system satisfies the substitutivity property if well-typed programs are closed
under substitution, i.e., if Γ ⊢ M : t and Δ, x : t ⊢ N : u and all variables in Γ and Δ are distinct,
then Γ, Δ ⊢ N[x := M] : u.

Here M[x := N] denotes substitution of N for x in M with the bound variables of M renamed
to avoid capture of the free variables of N. SML [MTH90, MT91] is a prototypical example of a
language based on an ND presentation that satisfies the substitutivity property.

Merely coming up with an ND presentation of LL that satisfies substitutivity has been an
outstanding problem. In the absence of such a system, Lincoln and Mitchell [LM92], Mackie [Mac91],
Wadler [Wad91a], and the authors of this paper in a preceding work [CGR92] employed approaches
that obtain some of the virtues of an ND system for LL. The system used in this paper is based
on a proposal of Benton, Bierman, de Paiva, and Hyland [BBdH92] that does satisfy the
substitutivity property, even though it lacks some of the desirable properties of the ND presentation of
intuitionistic logic (such as freedom from the need to use commuting conversions [GLT89]). We
refer the reader to their paper for a fuller discussion.

The  propositions  of  the  fragment  of  linear  logic  we  consider  are  given  by  the  grammar 

    s ::= a | (s ⊸ s) | !s

where  a  ranges  over  atomic  propositions.  The  proofs  of  linear  propositions  are  encoded  by  terms 
in  the  grammar 


    M ::= x | (λx : s. M) | (M M) |
          (store M where x1 = M1, ..., xn = Mn) | (fetch M) |
          (share x, y as M in M) | (dispose M before M).

Our notation here essentially corresponds to that in [CGR92, LM92] modulo incorporating
adjustments from [BBdH92]. The store operation,

    (store M where x1 = M1, ..., xn = Mn),

binds the variables x1, ..., xn in the expression M and the share operation

    (share x, y as M in N)
Table 2: Natural deduction rules and term assignment for linear logic.

    x : s ⊢ x : s

    Γ, x : s ⊢ M : t
    ---------------------------
    Γ ⊢ (λx : s. M) : (s ⊸ t)

    Γ ⊢ M : (s ⊸ t)    Δ ⊢ N : s
    ------------------------------
    Γ, Δ ⊢ (M N) : t

    Γ ⊢ M : !s    Δ ⊢ N : t
    ----------------------------------
    Γ, Δ ⊢ (dispose M before N) : t

    Γ ⊢ M : !s    Δ, x : !s, y : !s ⊢ N : t
    -----------------------------------------
    Γ, Δ ⊢ (share x, y as M in N) : t

    Γ1 ⊢ M1 : !s1   ...   Γn ⊢ Mn : !sn    x1 : !s1, ..., xn : !sn ⊢ N : t
    ------------------------------------------------------------------------
    Γ1, ..., Γn ⊢ (store N where x1 = M1, ..., xn = Mn) : !t

    Γ ⊢ M : !s
    --------------------
    Γ ⊢ (fetch M) : s



binds the variables x and y in N. The notation for store can be somewhat unwieldy when writing
programs, but most programs involving store bind the variables in the where clause to other
variables. Thus, if the free variables of M are x1, ..., xn, then (store M) is shorthand for the expression
(store M where x1 = x1, ..., xn = xn).

The typing rules for the language appear in Table 2, where the symbols Γ and Δ denote type
assignments, which are lists of pairs x1 : s1, ..., xn : sn, where each x_i is a distinct variable and
each s_i is a type. Each of the rules is built on the assumption that all left-hand sides of the ⊢
symbol are legal type assignments, e.g., in the rule for typing applications, the type assignments
Γ and Δ, which appear concatenated together in the conclusion of the rule, must have disjoint
variables. Each type-checking rule corresponds to a proof rule in the ND presentation of linear
logic. For instance, the rules for share and dispose essentially correspond to the proof rules generally
called contraction and weakening respectively, while those for store and fetch correspond to the
LL rules called promotion and dereliction. Due to the presence of explicit rules for weakening and
contraction (the rules for type-checking dispose and share), one can easily see that the free variables
of a well-typed term are exactly those contained in the type assignment. A particular note should
be taken of the form of the rule for store; this operation puts the value of its body with bindings
for its free variables in a location that can be shared by different terms during reduction, and the type
changes correspondingly from t to !t. The construct (fetch M) corresponds to reading the stored
value, and the type changes from !t to t.
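The rules of Table 2 can be turned into a small checker by exploiting the observation that the free variables of a well-typed term are exactly the variables of its type assignment: the context of a compound term is split according to the free variables of its parts. The sketch below is one way to do this and is not taken from the paper; the tuple encodings of terms and types and the helper names `fv`, `take`, and `typeof` are assumptions.

```python
from typing import Dict, Tuple

Type = Tuple  # ("atom", a) | ("lolli", s, t) | ("bang", s)


def fv(t) -> set:
    """Free variables of a core term."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":                        # ("lam", x, s, body)
        return fv(t[3]) - {t[1]}
    if tag in ("app", "dispose"):           # ("app", M, N), ("dispose", M, N)
        return fv(t[1]) | fv(t[2])
    if tag == "fetch":                      # ("fetch", M)
        return fv(t[1])
    if tag == "share":                      # ("share", x, y, M, N)
        return fv(t[3]) | (fv(t[4]) - {t[1], t[2]})
    if tag == "store":                      # ("store", N, [(x1, M1), ...])
        return set().union(*(fv(m) for _, m in t[2]))
    raise ValueError(f"unknown term {t!r}")


def take(ctx: Dict[str, Type], term):
    """Split ctx into the part mentioning fv(term) and the remainder."""
    names = fv(term)
    mine = {x: s for x, s in ctx.items() if x in names}
    rest = {x: s for x, s in ctx.items() if x not in names}
    return mine, rest


def typeof(ctx: Dict[str, Type], t) -> Type:
    """Return the type of t under ctx, or raise TypeError."""
    tag = t[0]
    if tag == "var":
        if set(ctx) != {t[1]}:              # axiom: the context is exactly x : s
            raise TypeError("unused or missing variable")
        return ctx[t[1]]
    if tag == "lam":
        _, x, s, body = t
        if x in ctx:
            raise TypeError("bound variable clashes with the context")
        return ("lolli", s, typeof({**ctx, x: s}, body))
    if tag == "fetch":
        s = typeof(ctx, t[1])
        if s[0] != "bang":
            raise TypeError("fetch needs an argument of type !s")
        return s[1]
    if tag == "app":
        gamma, delta = take(ctx, t[1])
        f, a = typeof(gamma, t[1]), typeof(delta, t[2])
        if f[0] != "lolli" or f[1] != a:
            raise TypeError("ill-typed application")
        return f[2]
    if tag == "dispose":
        gamma, delta = take(ctx, t[1])
        if typeof(gamma, t[1])[0] != "bang":
            raise TypeError("dispose needs an argument of type !s")
        return typeof(delta, t[2])
    if tag == "share":
        _, x, y, m, n = t
        gamma, delta = take(ctx, m)
        s = typeof(gamma, m)
        if s[0] != "bang" or x == y or x in delta or y in delta:
            raise TypeError("ill-typed share")
        return typeof({**delta, x: s, y: s}, n)
    if tag == "store":
        _, body, bindings = t
        inner, rest = {}, ctx
        for x, m in bindings:
            gamma, rest = take(rest, m)
            s = typeof(gamma, m)
            if s[0] != "bang" or x in inner:
                raise TypeError("ill-typed where clause")
            inner[x] = s
        if rest:
            raise TypeError("unused variable")
        return ("bang", typeof(inner, body))
    raise ValueError(f"unknown term {t!r}")
```

Under this sketch a term that uses a variable twice, or not at all, is rejected unless the duplication or discarding is made explicit with share or dispose.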

There  may  be  other  ND  presentations  of  LL  on  which  one  could  base  a  type  system.  It  is  our 
belief  that  results  in  this  paper  are  robust  with  respect  to  the  exact  choice  of  term  assignment  and 
type-checking  rules.  All  of  the  results  in  this  paper — including  negative  results  that  say  that  values 
of  linear  type  may  have  more  than  one  pointer  to  them — hold  in  the  system  described  in  [CGR92], 
and  we  expect  that  they  are  true  for  the  languages  described  in  [LM92]  and  [Mac91]. 
Table 3: Typing rules for non-logical constructs.

    ⊢ n : Nat

    ⊢ true, false : Bool

    Γ ⊢ M : Nat
    ---------------------
    Γ ⊢ (succ M) : Nat

    Γ ⊢ M : Nat
    ---------------------
    Γ ⊢ (pred M) : Nat

    Γ ⊢ M : Nat
    -----------------------
    Γ ⊢ (zero? M) : Bool

    Γ ⊢ M : !(!s ⊸ s)
    -------------------
    Γ ⊢ (fix M) : s

    Γ ⊢ L : Bool    Δ ⊢ M : s    Δ ⊢ N : s
    ----------------------------------------
    Γ, Δ ⊢ (if L then M else N) : s



A  programming  language  based  on  linear  logic. 

To fully realize the ideas of LL as the basis for a programming language, it is essential to go
beyond the core language. First of all, the language could be extended to one that includes the
linear logic connectives for pairing and sums, namely tensor ⊗, plus ⊕, and with &. Suitable ND
proof  rules  for  these  connectives  and  term  assignments  for  proofs  using  these  rules  are  described 
in  several  places  [Mac91,  LM92,  BBdH92].  A  more  challenging  question  is  how  to  extend  the 
language  to  include  constructs  for  which  the  use  of  the  Curry-Howard  correspondence  is  less  useful 
as  a  guide.  Examples  that  fall  in  this  category  are  arrays,  general  recursive  datatypes  involving 
linear  implication,  and  recursive  definitions  of  functions.  In  this  paper  we  treat  only  recursive 
function  definitions;  the  question  of  the  proper  treatment  of  recursive  definitions  in  an  LL-based 
language  is  likely  to  be  simpler  than  that  of  general  recursive  datatypes,  and  more  fundamental 
than  that  of  arrays. 

Our  language  is  essentially  a  synthesis  of  PCF  and  the  term  language  for  encoding  LL  natural 
deduction  proofs.  The  types  are  given  by  the  following  grammar: 

    s ::= Nat | Bool | (s ⊸ s) | !s

Types without leading !’s, e.g., Nat and (Nat ⊸ Bool), are called linear and those of the form !s
are called non-linear. We use the letters s, t, u, and v to denote types. The set of raw terms in
the language is given by the grammar

    M ::= x | (λx : s. M) | (M M) |
          n | true | false | (succ M) | (pred M) | (zero? M) | (if M then M else M) | (fix M) |
          (store M where x1 = M1, ..., xn = Mn) | (fetch M) |
          (share x, y as M in M) | (dispose M before M)

where the letter x denotes any variable, and n denotes a numeral in {0, 1, 2, ...}. The last four
operations  correspond  to  the  special  rules  of  linear  logic;  the  other  term  constructors  are  those  of 
PCF.  The  usual  definitions  of  free  and  bound  variables  for  PCF  also  apply  here  for  the  first  three 
lines  of  the  grammar. 
The  typing  rules  for  our  language  are  given  by  combining  Tables  2  and  3.  Two  of  these  rules
deserve  special  explanation.  First,  the  rule  for  checking  the  expression  if  L  then  M  else  N  checks 
both  branches  in  the  same  type  assignment,  i.e.,  the  terms  M  and  N  must  contain  the  same  free 
variables.  This  is  the  only  type-checking  rule  that  allows  variables  to  appear  multiple  times;  it  does 
not,  however,  violate  the  intuition  that  variables  are  used  once,  since  only  one  branch  will  be  taken 
during  the  execution  of  the  program.  Second,  the  slightly  mysterious  form  of  the  typing  rule  for 
recursions  is  related  to  the  idea  that  the  formal  parameter  of  a  recursive  definition  must  be  share’d 
and  dispose’d  if  there  is  to  be  anything  interesting  about  it.  Consider,  for  example,  the  rendering 
of  the  program  of  Table  1  into  our  language: 

    (fix (store λadd : !(!Nat ⊸ Nat ⊸ Nat). λx : !Nat. λy : Nat.
            share w, z as x in
              if zero? (fetch w)
                then dispose z before dispose add before y
                else (fetch add) (store (pred (fetch z))) (succ y)))
      (store 2) 1

(where  some  liberties  have  been  taken  in  dropping  a  few  of  the  parentheses  to  improve  readability). 
The  recursive  function  add  being  defined  gets  used  only  in  one  of  the  branches;  thus,  the  recursive 
call  must  have  a  non-linear  type. 

The definition of the addition function is a prototypical example of how one programs recursive
functions in this language. In fact, both the high-level and low-level semantics will only interpret
recursions (fix M) where M has the form

    (store (λf : !s ⊸ t. λx : s. M) where x1 = M1, ..., xn = Mn).

This restriction is closely connected to the restriction on interpreting recursion mentioned in the
previous section; the only difference here is the occurrence of the store. As before, this
restriction is not essential, but it does simplify the semantic clause for the recursion somewhat without
compromising the way programs are generally written.

Natural  semantics. 

Tables 4 and 5 give a high-level description of an interpreter for our language, written using natural
semantics. A natural semantics describes a partial function via proof trees. The notation M ⇓ c,
read ‘the term M halts at the final result c’, is used when there is a proof from the rules with the
conclusion being M ⇓ c. The terms at which the interpreter function halts are called canonical
forms; it is easy to see from the form of the rules that the canonical forms are n, true, false,
(λx. M), and (store M).

The natural semantics in Tables 4 and 5 describes a call-by-value evaluation strategy. That is,
operands in applications are evaluated to canonical form before the substitution takes place. A
basic property of the semantics is that types are preserved under evaluation:

Theorem 2 Suppose ⊢ M : s and M ⇓ c; then ⊢ c : s.

The proof can be carried out by an easy induction on the height of the proof tree of M ⇓ c.
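The high-level rules can be prototyped as a substitution-based evaluator. The sketch below rests on several assumptions that are not in the paper: terms are tagged tuples, only closed programs are evaluated (so the closed canonical forms being substituted cannot capture variables and no renaming is needed), store is supported only in its shorthand form (store M), and fix is omitted.

```python
def subst(t, x, c):
    """Replace free occurrences of variable x in t by the closed term c."""
    tag = t[0]
    if tag == "var":
        return c if t[1] == x else t
    if tag in ("num", "bool"):
        return t
    if tag == "lam":                                  # ("lam", y, body)
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, c))
    if tag == "share":                                # ("share", y, z, M, N)
        _, y, z, m, n = t
        n2 = n if x in (y, z) else subst(n, x, c)
        return ("share", y, z, subst(m, x, c), n2)
    # the remaining constructors bind nothing: substitute in every subterm
    return (tag,) + tuple(subst(s, x, c) for s in t[1:])


def ev(t):
    """Evaluate a closed term to canonical form, call-by-value."""
    tag = t[0]
    if tag in ("num", "bool", "lam", "store"):        # canonical forms
        return t
    if tag == "app":                                  # ("app", M, N)
        f = ev(t[1])
        d = ev(t[2])
        return ev(subst(f[2], f[1], d))
    if tag == "fetch":                                # evaluate the stored body
        return ev(ev(t[1])[1])
    if tag == "dispose":                              # evaluate, then discard
        ev(t[1])
        return ev(t[2])
    if tag == "share":
        _, x, y, m, n = t
        d = ev(m)
        return ev(subst(subst(n, x, d), y, d))
    if tag == "succ":
        return ("num", ev(t[1])[1] + 1)
    if tag == "pred":                                 # pred 0 evaluates to 0
        return ("num", max(ev(t[1])[1] - 1, 0))
    if tag == "zero?":
        return ("bool", ev(t[1])[1] == 0)
    if tag == "if":
        return ev(t[2]) if ev(t[1])[1] else ev(t[3])
    raise ValueError(f"unknown term {t!r}")
```

In this evaluator a stored body is re-evaluated by every fetch, since substitution duplicates the term (store N) rather than sharing a memory cell; nothing in it corresponds to memory.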
Table 4: Interpreting the Linear Core

    λx. M ⇓ λx. M

    M ⇓ d    N ⇓ c
    ------------------------------
    (dispose M before N) ⇓ c

    M ⇓ λx. P    N ⇓ d    P[x := d] ⇓ c
    -------------------------------------
    (M N) ⇓ c

    M ⇓ d    P[x, y := d] ⇓ c
    -------------------------------
    (share x, y as M in P) ⇓ c

    M1 ⇓ c1   ...   Mn ⇓ cn
    ------------------------------------------------------------------------------
    (store N where x1 = M1, ..., xn = Mn) ⇓ (store N[x1 := c1, ..., xn := cn])

    M ⇓ (store N)    N ⇓ c
    ------------------------
    (fetch M) ⇓ c



Table  5:  Interpreting  the  PCF  Extensions 


    true ⇓ true        false ⇓ false        n ⇓ n

    M ⇓ n
    ----------------------
    (succ M) ⇓ (n + 1)

    M ⇓ (n + 1)
    ----------------
    (pred M) ⇓ n

    M ⇓ 0
    ----------------
    (pred M) ⇓ 0

    M ⇓ 0
    --------------------
    (zero? M) ⇓ true

    M ⇓ (n + 1)
    ---------------------
    (zero? M) ⇓ false

    L ⇓ true    M ⇓ c
    ------------------------------
    (if L then M else N) ⇓ c

    L ⇓ false    N ⇓ c
    ------------------------------
    (if L then M else N) ⇓ c

    M1 ⇓ c1   ...   Mn ⇓ cn    M' ≡ M[x1 := c1, ..., xn := cn]
    -------------------------------------------------------------
    (fix (store (λf. λx. M) where x1 = M1, ..., xn = Mn))
        ⇓ (λx. M')[f := (store (fix (store λf. λx. M')))]



4  Semantics

The high-level natural semantics is useful as a specification for an interpreter for our language,
and  for  proving  facts  like  Theorem  2.  One  would  not  want  to  implement  the  semantics  directly, 
however:  explicit  substitution  into  terms  can  be  expensive,  and  one  would  therefore  use  some 
standard  representation  of  terms  like  closures  or  graphs  in  order  to  perform  substitution  more 
efficiently.  But  there  is  another  problem  with  the  high-level  semantics:  it  does  not  go  very  far  in 
providing  a  computational  intuition  for  the  LL  primitives  in  the  language.  For  example,  the  dispose 
operation  is  treated  essentially  as  ‘no-op’.  As  such,  there  is  no  apparent  relationship  between  these 
connectives  and  memory;  indeed,  the  semantics  entirely  suppresses  the  concept  of  memory. 

In  order  to  understand  what  the  constructs  of  linear  logic  have  to  do  with  memory,  we  construct 
a  semantics  that  relates  the  LL  primitives  to  reference  counting.  In  this  semantics,  the  linear  logic 
primitives  dispose  and  share  maintain  reference  counts.  The  basic  structure  of  the  reference-counting 
interpreter  is  the  same  as  the  one  outlined  in  Section  3.  Environments,  values,  and  storable  objects 
have  the  same  definition  as  before.  Because  we  now  want  to  maintain  reference  counts,  however, 
the definition of stores must change. A store is now a function

    σ : Loc → (ℕ × Storable),

where the left part of the returned pair denotes a reference count. Abusing notation, we use σ(l) to
denote the storable object associated with location l, and σ[l ↦ S] to denote a new store which is
the same as σ except at location l, which now holds the storable object S with the reference count
of l left unaffected. The reference count of a cell is denoted by refcount(l, σ). The domain of a
store σ is the set

    dom(σ) = {l ∈ Loc : refcount(l, σ) ≥ 1}.

The change in the definition of ‘store’ forces an adjustment in the definition of ‘allocation relation’.
A subset R of the product (Storable × Store) × (Loc × Store) is an allocation relation if, for any
store σ and storable object S, there is an l' and σ' where (S, σ) R (l', σ') and

• l' ∉ dom(σ) and dom(σ') = dom(σ) ∪ {l'};

• for all locations l ∈ dom(σ), σ(l) = σ'(l) and refcount(l, σ) = refcount(l, σ'); and

• σ'(l') = S and refcount(l', σ') = 1.
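A reference-counted store and a matching allocator can be sketched as below. The dictionary representation (location to a pair of count and storable object), the treatment of absent keys as having count 0, and the policy of reusing the smallest location outside the domain are assumptions of this sketch.

```python
from typing import Any, Dict, Set, Tuple

Store = Dict[int, Tuple[int, Any]]   # location -> (reference count, storable object)


def refcount(loc: int, store: Store) -> int:
    """Reference count of a cell; locations never written count as 0."""
    return store[loc][0] if loc in store else 0


def dom(store: Store) -> Set[int]:
    """The domain: locations whose reference count is at least 1."""
    return {l for l, (count, _) in store.items() if count >= 1}


def new(obj: Any, store: Store) -> Tuple[int, Store]:
    """Allocate outside dom(store) with count 1; cells with count 0 may be reused."""
    live = dom(store)
    loc = 0
    while loc in live:
        loc += 1
    return loc, {**store, loc: (1, obj)}


def update(store: Store, loc: int, obj: Any) -> Store:
    """Rebind loc to obj, leaving its reference count unaffected."""
    count, _ = store[loc]
    return {**store, loc: (count, obj)}
```

Because membership in the domain is defined by the count, a cell whose count has dropped to zero is free for reuse without any separate deallocation step.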

The basic structure underlying a store may be captured abstractly by a graph. Formally, a
graph is a tuple (V, E, s, t) where V and E are sets of vertices and edges respectively and s, t are
functions from E to V called the source and target functions respectively. (Note that there may
be more than one edge with the same source and target; such ‘multiple edge’ graphs are sometimes
called multigraphs.) Given v ∈ V, the in-degree of v is the number of elements e ∈ E such that
t(e) = v. A vertex v is reachable from a vertex v' if v = v' or there is a path from v' to v, that
is, there is a list of edges e_1, ..., e_n such that v' = s(e_1), v = t(e_n) and t(e_i) = s(e_{i+1}) for each i < n.

A memory graph 𝒢 is a tuple (V, E, s, t, [ρ_1, ..., ρ_n]) where (V, E, s, t) is a graph together
with a list of functions ρ_i such that each ρ_i is a function with a finite domain and with V as its
codomain. The functions ρ_i are called the root set of the memory graph. Given v ∈ V and ρ_i,
let |ρ_i^{-1}(v)| be the number of elements x in the domain of ρ_i such that ρ_i(x) = v. The reference
count of a vertex v ∈ V is the sum

    in-degree(v) + Σ_{i=1}^{n} |ρ_i^{-1}(v)|.

A vertex in a memory graph is said to be reachable from ρ_i if it is reachable from an element in
the image of ρ_i.

[Figure 3: A memory graph. Diagram not reproduced; it shows the vertices together with the root set.]

A state is a triple (l, ρ, σ) where l is a list of locations, ρ is a list of environments and σ is
a store. It is assumed that the set of locations in l and the image of each environment in ρ are
contained in dom(σ).

Definition 3 If S = (l, ρ, σ) is a state where l = [l_1, ..., l_n] and ρ = [ρ_1, ..., ρ_m], then the memory
graph 𝒢(S) induced by S is defined as follows. The vertices of the graph are the locations in
dom(σ), and the edges are determined by the following definition.

• If l ∈ dom(σ) is such that σ(l) = susp(l') or σ(l) = rec(l', f), there is an edge from l to l'.

• Suppose l ∈ dom(σ) is such that σ(l) = closure(N, ρ) or thunk(N, ρ). Then for every x ∈
dom(ρ), there is an edge from l to ρ(x).

Let f : {1, ..., n} → V be given by f : i ↦ l_i. The root set of the induced memory graph is the
list [f, ρ_1, ..., ρ_m].

For instance, the state (l, ρ, σ) where dom(ρ) = {x}, ρ(x) = l'', σ(l) = thunk(M, [y, z ↦ l']),
σ(l') = 3, σ(l'') = susp(l'''), and σ(l''') = true induces the memory graph in Figure 3. We will abuse
notation and sometimes write 𝒢(σ) for the graph induced by σ alone (with no root set).

We  are  primarily  concerned  with  states  that  satisfy  a  collection  of  basic  invariants. 

Definition 4 A state S = (l, ρ, σ) is count-correct if, for each l ∈ dom(σ), refcount(l, σ) is equal
to the reference count of l in 𝒢(S).
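Count-correctness can be checked mechanically by recomputing reference counts from the induced memory graph. The sketch below follows the edge clauses of Definition 3 as stated (susp, rec, closure, thunk); the tuple encoding of cells, the integer locations, and the function names are assumptions introduced here.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Store = Dict[int, Tuple[int, Any]]   # location -> (reference count, storable object)
Env = Dict[str, int]


def out_edges(obj: Any) -> List[int]:
    """Targets of the edges leaving a cell that holds obj."""
    tag = obj[0]
    if tag in ("susp", "rec"):                 # ("susp", l') or ("rec", l', f)
        return [obj[1]]
    if tag in ("closure", "thunk"):            # (tag, term, env): one edge per variable
        return list(obj[2].values())
    return []                                  # numerals and booleans point nowhere


def graph_refcounts(locs: List[int], envs: List[Env], store: Store) -> Counter:
    """In-degree plus root-set contributions for every vertex."""
    counts: Counter = Counter()
    for loc, (count, obj) in store.items():
        if count >= 1:                         # only cells in dom(store) are vertices
            for target in out_edges(obj):
                counts[target] += 1
    for loc in locs:                           # the root function f : i -> l_i
        counts[loc] += 1
    for env in envs:                           # the environments in the root set
        for target in env.values():
            counts[target] += 1
    return counts


def count_correct(locs: List[int], envs: List[Env], store: Store) -> bool:
    counts = graph_refcounts(locs, envs, store)
    return all(count == counts[loc]
               for loc, (count, _) in store.items() if count >= 1)
```

Encoding the example state from the text with l, l', l'', l''' as locations 0 to 3, the computed counts are 1, 2, 1, and 1 respectively; the count 2 arises because y and z both map to l' and multiple edges are counted separately.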
Definition 5 A state S = (l, ρ, σ) is called regular, written ℜ(S), provided the following
conditions hold:

ℜ1 S is count-correct.

ℜ2 dom(σ) is finite.

ℜ3 For each l ∈ dom(σ), if σ(l) = thunk(M, ρ), then refcount(l, σ) = 1.

ℜ4 A cycle in the memory graph induced by S arises only in the form of a rec and recclosure
as in Figure 2: that is, it has two nodes l0 and l1 such that σ(l0) = rec(l1, f) and σ(l1) =
recclosure(λx. M, ρ[f ↦ l0]) for some f, x, M, and ρ.

ℜ5 For each l ∈ dom(σ), if σ(l) = thunk(M, ρ), then the domain of ρ is the set of free variables of
M, and M is typeable. Similarly, if σ(l) = closure(λx. M, ρ) or recclosure(λx. M, ρ), then the
domain of ρ is the set of free variables of λx. M, and λx. M is typeable.

Here, a term M is said to be typeable if there is some type context Γ and type t such that
Γ ⊢ M : t.

It is convenient to abuse notation slightly in denoting states by writing locations, environments,
and store without grouping them as in the official definition. For example, (l1, l2, ρ', σ, l, ρ) should
be read as (l1 :: l2 :: l, ρ' :: ρ, σ) (where :: is the ‘cons’ operation that puts a datum at the head of
a list). There is no chance of confusion so long as the lexical conventions distinguish the parts of
the tuple, and the locations and environments are properly ordered from left to right. However,
the order of these lists is irrelevant for regularity: if ℜ(l, ρ, σ) and l', ρ' are permutations of l and ρ
respectively, then ℜ(l', ρ', σ). We will use this fact without explicit mention.


Basic  reference-counting  operations. 


Our interpreter will need four auxiliary functions to manipulate reference counts. Two of these
functions, inc and dec, increment and decrement reference counts. More formally, inc(l, σ) increments
the reference count of l and returns the resultant store, while dec(l, σ) decrements the reference
count of l and returns the resultant store. The other two operations, inc-env(ρ, σ) and dec-ptrs(l, σ),
increment or decrement the reference counts of multiple cells. The formal definition of the first of
these is


inc-env(ρ, σ) = σ_n, where the domain of ρ is {x_1, ..., x_n}, and

    σ_1 = inc(ρ(x_1), σ)
    ...
    σ_n = inc(ρ(x_n), σ_{n-1})


In words, inc-env(ρ, σ) increments the reference counts of the locations in the range of ρ and returns
the resultant store. Note that a location's reference count may be incremented more than once by
this operation, since two variables may map to the same location l according to ρ.
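As an informal illustration, the three increment and decrement operations can be sketched in Python. The encoding is an assumption made only for this sketch: a store is a dictionary from location names to (reference count, value) pairs, an environment is a dictionary from variables to locations, and a cell whose count reaches zero is simply left in the dictionary.

```python
# Hypothetical encoding (not from the paper): store = {location: (refcount, value)}.

def refcount(l, store):
    return store[l][0]

def inc(l, store):
    """Return a new store in which the reference count of l is one higher."""
    count, value = store[l]
    return {**store, l: (count + 1, value)}

def dec(l, store):
    """Return a new store in which the reference count of l is one lower."""
    count, value = store[l]
    return {**store, l: (count - 1, value)}

def inc_env(env, store):
    """Thread the store through one inc per variable in dom(env)."""
    for x in env:
        store = inc(env[x], store)
    return store

store = {"l0": (1, 5), "l1": (1, True)}
env = {"x": "l0", "y": "l0", "z": "l1"}   # x and y map to the same location
result = inc_env(env, store)
```

Because x and y both map to l0 in this example, the count of l0 is incremented twice, which is the situation described in the note above.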

The operation dec-ptrs(l, σ), which also returns an updated store, first decrements the reference
count of location l. If the reference count falls to zero, it then recursively decrements the reference
counts of all cells pointed to by l. The formal definition appears in Table 6; an example appears
in Figure 4 where the left side of Figure 4 (assumed to be part of the graph of the store σ) is
transformed into the right side by calling dec-ptrs(l, σ). The operation dec-ptrs(l, σ) is the single
most complex operation used in the interpreter. Other operations are 'local' to parts of the memory
graph and do not require a recursive definition. A key characteristic of our semantics is the fact
that dec-ptrs(l, σ) is only used in the rule for evaluating (dispose M before N).

Table 6: The Definition of dec-ptrs.

dec-ptrs(l, σ) =

    dec(l, σ)
        if σ(l) = n, true, or false
    dec-ptrs(l', dec(l, σ))
        if σ(l) = susp(l'), refcount(l, σ) = 1
    dec-ptrs-env(ρ, dec(l, σ))
        if σ(l) = thunk(M, ρ), refcount(l, σ) = 1
    dec-ptrs-env(ρ, dec(l, σ))
        if σ(l) = closure(λx. M, ρ), refcount(l, σ) = 1
    dec-ptrs-env(ρ, dec(l', dec(l, dec(l, σ))))
        if σ(l) = recclosure(λx. N, ρ[f ↦ l']), σ(l') = rec(l, f),
           refcount(l, σ) = 2, refcount(l', σ) = 1
    dec-ptrs-env(ρ, dec(l', dec(l, dec(l, σ))))
        if σ(l) = rec(l', f), σ(l') = recclosure(λx. N, ρ[f ↦ l]),
           refcount(l, σ) = 2, refcount(l', σ) = 1
    dec(l, σ)
        otherwise

dec-ptrs-env(ρ, σ) = σ_n, where the domain of ρ is {x_1, ..., x_n}, and

    σ_1 = dec-ptrs(ρ(x_1), σ)
    ...
    σ_n = dec-ptrs(ρ(x_n), σ_{n-1})

Figure 4: An Example of the dec-ptrs Operation.

The basic laws that capture the relationships maintained by the reference-counting, allocation,
and update operations on states are given in Table 7. Most of the laws are proven in the appendix,
but we give the proof for the Attenuation Law A1 here to show how the proofs go. Suppose
ℜ(l, l̂, ρ̂, σ), refcount(l, σ) = 1 and σ(l) = closure(λx. N, ρ), recclosure(λx. N, ρ), or thunk(N, ρ).
Note first that the state S' = (l̂, ρ, ρ̂, dec(l, σ)) is count-correct: the environment ρ has been placed in
the root set, accounting for the edges coming out of the closure or thunk which has now disappeared
from the memory graph. Thus, property ℜ1 holds of state S'. Since dom(σ) ⊇ dom(dec(l, σ)), each
of the properties ℜ2 through ℜ5 follows directly from the hypothesis. Thus, ℜ(S'). The property is
called an "attenuation law" because pointers previously held inside the store are drawn out to the
root set.

The next goal is to define an interpreter for the LL-based programming language. To understand
the interpreter it is essential to appreciate how the invariants influence its design. We therefore
describe the theorem that the interpreter is expected to satisfy, and mingle the proof of the theorem
with the definition of the interpreter itself. The interpreter is a function interp which takes as its
arguments a term M, an environment ρ, and a store σ. It is assumed that the domain of ρ is the
set of free variables in M and that the image of ρ is contained in the domain of σ. The result of
interp(M, ρ, σ) is a pair (l', σ') where σ' is a store and l' is a location in the domain of σ' such
that σ'(l') is a value, which can be viewed as the result of the computation. We use a binary infix
@ for appending two lists. The theorem is stated as follows:

Theorem 6 Let S = (ρ, σ, l̂, ρ̂) be a state and suppose M is a typeable term. If ℜ(S) and
interp(M, ρ, σ) = (l', σ'), then ℜ(l', σ', l̂, ρ̂).

Moreover, if ρ̂ = ρ̂_1 @ ρ̂_2, l̂ = l̂_1 @ l̂_2, and l ∈ dom(σ) is not reachable from ρ :: ρ̂_1 or l̂_1 in the
memory graph induced by S, then the contents and reference count of l remain unchanged and l is
not reachable from ρ̂_1 or l' :: l̂_1 in the memory graph induced by (l', σ', l̂, ρ̂).

The  first  part  of  the  theorem  says  that  regularity  is  preserved  under  execution  of  typeable  terms. 
The  second  part  of  the  theorem  expresses  what  we  will  call  the  reachability  property.  The 
special case of interest says that the evaluation of a program M in environment ρ and store σ does
not affect locations in dom(σ) that are not reachable from ρ. The extra complexity of the statement
is  required  to  maintain  a  usable  induction  hypothesis  in  the  proof  of  the  property.  A  simplified 
version  of  Theorem  6  can  be  expressed  as  follows: 

Corollary 7 Suppose M is a closed, typeable term. If interp(M, ∅, ∅) = (l', σ'), then ℜ(l', σ').

The assumption that M is typeable is crucial in the proof of the theorem, because untypeable
terms may not maintain reference counts correctly. For instance, the term

(λx. (dispose x before x)) (store 1)

would cause a run-time error in the maintenance of reference counts: after the dispose, we would try
to access a portion of memory with reference count zero and get a 'dangling pointer' error. This
example shows that untypeable terms may cause premature deallocations. Another untypeable
term

(λx. (share y,z as x in (dispose y before 2))) (store 1)
Table 7: Memory Graph Laws.

Attenuation Laws   Suppose ℜ(l, l̂, ρ̂, σ) and refcount(l, σ) = 1.

A1 If σ(l) = closure(λx. N, ρ), recclosure(λx. N, ρ), or thunk(N, ρ), then ℜ(l̂, ρ, ρ̂, dec(l, σ)).
A2 If σ(l) = susp(l'), then ℜ(l', l̂, ρ̂, dec(l, σ)).

Laws of Decrement

D1 If ℜ(l, l̂, ρ̂, σ) and σ(l) is a constant, then ℜ(l̂, ρ̂, dec(l, σ)).

D2 If ℜ(l, l̂, ρ̂, σ) and refcount(l, σ) ≠ 1, then ℜ(l̂, ρ̂, dec(l, σ)).

D3 If ℜ(l, l̂, ρ̂, σ), then ℜ(l̂, ρ̂, dec-ptrs(l, σ)).

Laws of Increment

I1 If ℜ(l̂, ρ̂, σ) and l ∈ dom(σ), then ℜ(l, l̂, ρ̂, inc(l, σ)).

I2 Suppose ℜ(l̂, ρ̂, σ) and ρ(x) ∈ dom(σ) for all x ∈ dom(ρ). Then ℜ(l̂, ρ, ρ̂, inc-env(ρ, σ)).

Environment Law

E Suppose x ∉ dom(ρ). Then ℜ(l, l̂, ρ, ρ̂, σ) iff ℜ(l̂, ρ[x ↦ l], ρ̂, σ).

Allocation Laws

N1 If ℜ(l̂, ρ̂, σ) and (l', σ') = new(c, σ) for some constant c, then ℜ(l', l̂, ρ̂, σ').

N2 Suppose ℜ(l̂, ρ, ρ̂, σ) and (l', σ') is equal to new(closure(N, ρ), σ), new(thunk(N, ρ), σ), or
new(recclosure(N, ρ), σ) where FV(N) = dom(ρ). If N is typeable, then ℜ(l', l̂, ρ̂, σ').

N3 If ℜ(l, l̂, ρ̂, σ) and (l', σ') = new(susp(l), σ) or new(rec(l, f), σ), then ℜ(l', l̂, ρ̂, σ').

Update Laws

U1 Suppose S = (l̂, ρ̂, σ) and ℜ(S) and σ(l) is a constant and l' ∈ dom(σ).

• If l is not reachable from l' in the memory graph induced by S, then
ℜ(l', l̂, ρ̂, inc(l', σ[l ↦ susp(l')])).

• If σ(l') = recclosure(λx. N, ρ[f ↦ l]), then ℜ(l', l̂, ρ̂, inc(l', σ[l ↦ rec(l', f)])).

U2 If ℜ(l, l̂, ρ̂, σ), refcount(l, σ) ≠ 1, σ(l) = susp(l'), and σ(l') = thunk(N, ρ), then
ℜ(ρ, l̂, ρ̂, dec(l', dec(l, σ[l ↦ c]))).
causes a 'space leak', i.e., the reference count of the cell holding (store 1) is still greater than zero
even though it is garbage at the end of the execution.

Interpreting  the  linear  core. 

The  proof  of  Theorem  6  is  by  induction  on  the  number  of  calls  to  the  interpreter.  The  proof 
proceeds  by  considering  each  case  for  the  program  to  be  evaluated. 

The  interpretation  of  a  variable  is  obtained  by  looking  up  the  variable  in  the  environment: 

(1) interp(x, ρ, σ) = (ρ(x), σ)

That the state (ρ(x), σ, l̂, ρ̂) is regular is a consequence of the Environment Law E because of the
assumption that the domain of ρ is {x}. The reachability condition is clearly satisfied, since the
output store is the same as the input store.

To  evaluate  an  abstraction  we  create  a  new  closure,  place  it  in  a  new  cell,  and  return  the  location 
together  with  the  updated  store: 

(2) interp(λx. P, ρ, σ) = new(closure(λx. P, ρ), σ)

To prove that regularity of the state is preserved, suppose that (l', σ') = new(closure(λx. P, ρ), σ);
then ℜ(l', σ', l̂, ρ̂) by Allocation Law N2. The reachability condition is satisfied because the output
store differs from the input store only by extending it.

Given a term P and an environment ρ whose domain includes the free variables of P, let ρ|P
be the restriction of the environment ρ to the free variables of P. The evaluation of an application
is given as follows:

(3) interp((P Q), ρ, σ) =
      let (l_0, σ_0) = interp(P, ρ|P, σ)
          (l_1, σ_1) = interp(Q, ρ|Q, σ_0)
      in case σ_1(l_0) of closure(λx. N, ρ') or recclosure(λx. N, ρ') =>
           if refcount(l_0, σ_1) = 1
           then interp(N, ρ'[x ↦ l_1], dec(l_0, σ_1))
           else interp(N, ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1)))

The reader may compare this rule to the rule for application given in Section 3. The key difference in
the semantic clauses is the manipulation of reference counts: in the rule here, a conditional breaks
the evaluation of the function body into two cases based on the reference count of the location
that holds the value of the operator, and each branch of the conditional performs some
reference-counting arithmetic. The resulting semantic clause looks similar to a denotational semantics such
as that given in [Hud87] where information about reference counts is included in the semantic
clauses. Note that the environment ρ has been split between the two subterms P and Q. The fact
that (P Q) is typeable implies that ρ = (ρ|P) ∪ (ρ|Q). In various forms this sort of property will
be used repeatedly in the semantic clauses below.
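To make the two branches of clause (3) concrete, here is a minimal Python sketch covering only variables, abstractions, and applications. The term and store encodings, the helper names, and the fresh-location scheme are all assumptions introduced for illustration; recclosure is not handled.

```python
import itertools

# Hypothetical encodings: terms are ("var", x), ("lam", x, body), ("app", P, Q);
# store = {location: (refcount, value)}; closures are ("closure", x, body, env).
_fresh = itertools.count()

def new(value, store):
    l = f"l{next(_fresh)}"
    return l, {**store, l: (1, value)}

def inc(l, store):
    count, value = store[l]
    return {**store, l: (count + 1, value)}

def dec(l, store):
    count, value = store[l]
    return {**store, l: (count - 1, value)}

def inc_env(env, store):
    for x in env:
        store = inc(env[x], store)
    return store

def fv(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    return fv(t[1]) | fv(t[2])

def restrict(env, t):
    return {x: env[x] for x in fv(t)}

def interp(t, env, store):
    if t[0] == "var":
        return env[t[1]], store
    if t[0] == "lam":
        return new(("closure", t[1], t[2], env), store)
    _, p, q = t
    l0, s0 = interp(p, restrict(env, p), store)
    l1, s1 = interp(q, restrict(env, q), s0)
    count, (_tag, x, body, cenv) = s1[l0]
    if count == 1:
        # Last reference to the closure: its environment is handed over as is.
        return interp(body, {**cenv, x: l1}, dec(l0, s1))
    # The closure survives, so the locations in its environment gain a reference.
    return interp(body, {**cenv, x: l1}, inc_env(cenv, dec(l0, s1)))

identity = ("lam", "x", ("var", "x"))
loc, final = interp(("app", identity, ("lam", "y", ("var", "y"))), {}, {})

# A closure with count 2 in a hand-built store exercises the other branch.
shared = {"a": (2, ("closure", "x", ("var", "x"), {})), "b": (1, 7)}
loc2, final2 = interp(("app", ("var", "f"), ("var", "z")), {"f": "a", "z": "b"}, shared)
```

In the first run the operator cell ends with count zero and the result is the argument closure; in the second the operator cell keeps one remaining reference.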

To prove the preservation of regularity of the state for application, we start with the
assumption that ℜ(ρ, σ, l̂, ρ̂). This is equivalent to ℜ(ρ|P, ρ|Q, σ, l̂, ρ̂). Now ℜ(l_0, ρ|Q, σ_0, l̂, ρ̂) and
ℜ(l_1, l_0, σ_1, l̂, ρ̂) both hold by induction hypothesis (let us abbreviate 'induction hypothesis' as
'IH'). Now, there are two possibilities for the reference count of l_0 in σ_1: either it is equal to
one or it is more than one. If refcount(l_0, σ_1) = 1, then the first Attenuation Law, A1, says that
ℜ(l_1, ρ', dec(l_0, σ_1), l̂, ρ̂). By the Environment Law, E, this implies that ℜ(ρ'[x ↦ l_1], dec(l_0, σ_1), l̂, ρ̂)
and the desired conclusion then follows from IH. If, on the other hand, refcount(l_0, σ_1) ≠ 1, then
ℜ(l_1, ρ', inc-env(ρ', dec(l_0, σ_1)), l̂, ρ̂) by I2 and D2. Hence ℜ(ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1)), l̂, ρ̂)
by E, so we are done by IH.

To see that the reachability property holds for the interpretation of application, suppose l ∈
dom(σ) is unreachable from ρ :: ρ̂_1 where ρ̂ = ρ̂_1 @ ρ̂_2 and unreachable from l̂_1 where l̂ = l̂_1 @ l̂_2.
If l is unreachable from (l̂_1, ρ :: ρ̂_1), then it is unreachable from (l̂_1, (ρ|P) :: (ρ|Q) :: ρ̂_1), so,
by IH, it is unreachable from (l_0 :: l̂_1, (ρ|Q) :: ρ̂_1) in the memory graph induced by the state
resulting from the evaluation of P. A second application of IH allows us to conclude that it
is also unreachable from (l_1 :: l_0 :: l̂_1, ρ̂_1) in the memory graph induced by (l_1, l_0, σ_1, l̂, ρ̂). By the
definition of the memory graph, this implies that l is unreachable from ρ' as well, so it is unreachable
from (l̂_1, ρ'[x ↦ l_1] :: ρ̂_1) in the memory graphs induced by the states (ρ'[x ↦ l_1], dec(l_0, σ_1), l̂, ρ̂) and
(ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1)), l̂, ρ̂). The desired conclusion therefore follows from IH. The proof
of the reachability is similar for all of the remaining cases, so we will omit arguing it in the rest of
the discussion.

The expression (store N where x_1 = M_1, ..., x_n = M_n) is interpreted by first evaluating the
terms M_1, ..., M_n to locations l_1, ..., l_n, building an environment that maps x_i to l_i for all i, creating
a thunk out of this environment and N, and finally returning a location holding a suspension of
this thunk:

(4) interp((store N where x_1 = M_1, ..., x_n = M_n), ρ, σ) =
      let (l_1, σ_1) = interp(M_1, ρ|M_1, σ)
          ...
          (l_n, σ_n) = interp(M_n, ρ|M_n, σ_{n-1})
          ρ' = [x_1, ..., x_n ↦ l_1, ..., l_n]
          (l_{n+1}, σ_{n+1}) = new(thunk(N, ρ'), σ_n)
      in new(susp(l_{n+1}), σ_{n+1})

To prove that the desired property is maintained, note that repeated application of the induction
hypothesis allows us to conclude that ℜ(ρ', σ_n, ρ̂, l̂). Therefore, ℜ(l_{n+1}, σ_{n+1}, ρ̂, l̂) by N2. Let
(l_{n+2}, σ_{n+2}) = new(susp(l_{n+1}), σ_{n+1}). Then ℜ(l_{n+2}, σ_{n+2}, ρ̂, l̂) by N3.

The fetch of a suspended object is the most complex of all the operations. It must evaluate a
thunk if the suspension holds one. The code is again similar to that for the interpreter in Section 3
we examined earlier, but, in addition to the reference-counting arithmetic, there is a clause dealing
with recursion:

(5) interp((fetch P), ρ, σ) =
      let (l_0, σ_0) = interp(P, ρ, σ)
      in case σ_0(l_0)
           of susp(l_1) =>
                case σ_0(l_1)
                  of thunk(R, ρ') =>
                       if refcount(l_0, σ_0) = 1
                       then interp(R, ρ', dec(l_1, dec(l_0, σ_0)))
                       else let (l_2, σ_1) = interp(R, ρ', dec(l_1, dec(l_0, σ_0[l_0 ↦ 0])))
                            in (l_2, inc(l_2, σ_1[l_0 ↦ susp(l_2)]))
                   | _ => if refcount(l_0, σ_0) = 1
                          then (l_1, dec(l_0, σ_0))
                          else (l_1, inc(l_1, dec(l_0, σ_0)))
            | rec(l_1, f) => (l_1, dec(l_0, inc(l_1, σ_0)))
By IH, we have ℜ(l_0, σ_0, l̂, ρ̂). Suppose σ_0(l_0) = susp(l_1) and σ_0(l_1) = thunk(R, ρ'). If
refcount(l_0, σ_0) = 1, then ℜ(ρ', dec(l_1, dec(l_0, σ_0)), l̂, ρ̂) by A1 and A2 so we are done by IH.
Suppose, on the other hand, that refcount(l_0, σ_0) ≠ 1. By U2, ℜ(ρ', dec(l_1, dec(l_0, σ_0[l_0 ↦ 0])), l̂, ρ̂) so
ℜ(l_2, inc(l_2, σ_1[l_0 ↦ susp(l_2)]), l̂, ρ̂) by IH and U1; the reachability property is used to ensure the
applicability of U1. More specifically, in σ_0 the location l_0 is not reachable from ρ'; thus, it is not
reachable from l_2 in σ_1 either, and so σ_1[l_0 ↦ susp(l_2)] does not create an illegal loop in the memory
graph. The cases when σ_0(l_1) is a value or σ_0(l_0) = rec(l_1, f) are left to the reader.

The share command increments the reference count of a location:

(6) interp((share x, y as P in Q), ρ, σ) =
      let (l_0, σ_0) = interp(P, ρ|P, σ)
      in interp(Q, (ρ|Q)[x, y ↦ l_0], inc(l_0, σ_0))

ℜ(l_0, ρ|Q, σ_0, l̂, ρ̂) by IH, so ℜ(l_0, l_0, ρ|Q, inc(l_0, σ_0), l̂, ρ̂) by I1. Thus it follows from the
Environment Law E that ℜ((ρ|Q)[x, y ↦ l_0], inc(l_0, σ_0), l̂, ρ̂), so the result follows from IH.

The dispose command decrements the reference count of a location. This requires calculating
the consequences of possibly removing a node from the memory graph if the reference count of the
disposed node falls to 0.

(7) interp((dispose P before Q), ρ, σ) =
      let (l_0, σ_0) = interp(P, ρ|P, σ)
      in interp(Q, ρ|Q, dec-ptrs(l_0, σ_0))

Now, ℜ(l_0, ρ|Q, σ_0, l̂, ρ̂) by IH, so ℜ(ρ|Q, dec-ptrs(l_0, σ_0), l̂, ρ̂) by D3. The result therefore follows
from IH.

Interpreting  PCF  extensions. 

The interpreter evaluates a constant simply by creating a cell holding the value of the constant.

(8) interp(n, ρ, σ) = new(n, σ)

(9) interp(true, ρ, σ) = new(true, σ)

(10) interp(false, ρ, σ) = new(false, σ)

That regularity is preserved for these cases follows immediately from N1.

The  rules  for  the  arithmetic  and  boolean  operations  of  PCF  mimic  the  rules  of  the  high-level 
operational  semantics. 

(11) interp((succ P), ρ, σ) =
       let (l_0, σ_0) = interp(P, ρ, σ)
       in new(σ_0(l_0) + 1, dec(l_0, σ_0))

(12) interp((pred P), ρ, σ) =
       let (l_0, σ_0) = interp(P, ρ, σ)
           n = σ_0(l_0)
       in if n = 0
          then new(0, dec(l_0, σ_0))
          else new(n − 1, dec(l_0, σ_0))

(13) interp((zero? P), ρ, σ) =
       let (l_0, σ_0) = interp(P, ρ, σ)
       in if σ_0(l_0) = 0
          then new(true, dec(l_0, σ_0))
          else new(false, dec(l_0, σ_0))

To prove the desired property for the successor operation, note that ℜ(l_0, σ_0, l̂, ρ̂) follows from IH
so we are done by D1 and N1. Proofs for the other two cases are similar.
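Clauses (8) through (13) can be exercised on closed arithmetic terms with a short Python sketch. The encodings are assumptions made for illustration: terms are Python constants or ("succ", P), ("pred", P), ("zero?", P) tuples, the store is a dictionary from locations to (count, value) pairs, and the environment argument is dropped because only closed terms are handled.

```python
import itertools

# Hypothetical encoding: store = {location: (refcount, value)}.
_fresh = itertools.count()

def new(value, store):
    l = f"l{next(_fresh)}"
    return l, {**store, l: (1, value)}

def dec(l, store):
    count, value = store[l]
    return {**store, l: (count - 1, value)}

def interp(term, store):
    """Closed terms only: a constant, or (op, P) with op in succ, pred, zero?."""
    if not isinstance(term, tuple):
        return new(term, store)          # clauses (8)-(10)
    op, p = term
    l0, s0 = interp(p, store)
    n = s0[l0][1]
    s0 = dec(l0, s0)                     # the operand cell loses its only reference
    if op == "succ":
        return new(n + 1, s0)
    if op == "pred":
        return new(0 if n == 0 else n - 1, s0)
    if op == "zero?":
        return new(n == 0, s0)
    raise ValueError(op)

loc, final = interp(("zero?", ("pred", ("succ", 0))), {})
live = [l for l, (count, _) in final.items() if count > 0]
```

Each intermediate cell is decremented as soon as its value has been consumed, so only the result cell is live at the end.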

The conditional statement has the expected form, but the reference count of the condition must
be decremented in each of the branches:

(14) interp(if N then P else Q, ρ, σ) =
       let (l_0, σ_0) = interp(N, ρ|N, σ)
       in if σ_0(l_0) = true
          then interp(P, ρ|P, dec(l_0, σ_0))
          else interp(Q, ρ|Q, dec(l_0, σ_0))

The IH implies ℜ(l_0, σ_0, l̂, ρ̂). Whether or not σ_0(l_0) = true, the desired conclusion follows from D1.
Finally,  to  interpret  recursion,  we  will  need  a  rule  similar  to  the  rule  for  interpreting  store. 

(15) interp((fix (store (λf. λx. M) where x_1 = M_1, ..., x_n = M_n)), ρ, σ) =
       let (l_1, σ_1) = interp(M_1, ρ|M_1, σ)
           ...
           (l_n, σ_n) = interp(M_n, ρ|M_n, σ_{n-1})
           ρ' = [x_1, ..., x_n ↦ l_1, ..., l_n]
           (l_{n+1}, σ_{n+1}) = new(0, σ_n)
           (l_{n+2}, σ_{n+2}) = new(recclosure(λx. M, ρ'[f ↦ l_{n+1}]), σ_{n+1})
       in (l_{n+2}, inc(l_{n+2}, σ_{n+2}[l_{n+1} ↦ rec(l_{n+2}, f)]))

As with the interpretation of store, repeated application of the IH and E implies that ℜ(ρ', σ_n, l̂, ρ̂).
By N1, E, and N2, we therefore also have ℜ(l_{n+2}, σ_{n+2}, l̂, ρ̂). The desired conclusion now follows
from U1.
5 Properties of the Semantics

In  order  for  the  reference-counting  interpreter  to  make  sense,  it  must  satisfy  a  number  of  invariants 
and  correctness  criteria.  In  this  section  we  describe  these  precisely. 

No  space  leaks. 

As a short example of the kind of property one expects the semantics to satisfy, let us consider
how the idea that 'there are no space leaks' can be expressed in our formalism. Given a state
S = (l̂, ρ̂, σ), we say that a location l is reachable from (l̂, ρ̂) if it is reachable in G(S) from some
l_i ∈ l̂ or from some ρ_j ∈ ρ̂. The desired property can now be expressed as follows:

Theorem 8 Suppose (ρ, σ, l̂, ρ̂) is a regular state such that each l ∈ dom(σ) is reachable from
(ρ, l̂, ρ̂). If M is typeable and interp(M, ρ, σ) = (l', σ'), then every l ∈ dom(σ') is reachable from
(l', l̂, ρ̂).

The  theorem  is  proved  by  induction  on  the  number  of  calls  to  the  interpreter. 

Invariance  under  different  allocation  relations. 

If the design of the interpreter is correct, the exact memory usage pattern should be unimportant
to the final answers returned by the interpreter. Since the allocation relation new completely
determines memory usage (i.e., which cell with reference count 0 will be filled next), it should
not matter which allocation relation is used. We set this up formally as follows: if f is an allocation
relation, let interp_f be the partial interpreter function defined by using f in the place of new. Recall
that the environment and store with empty domains are denoted by ∅. Then we would like to prove
something like the following statement by induction on the number of calls to interp_f:

Suppose f and g are allocation relations. If interp_f(M, ∅, ∅) = (l_f, σ_f), then
interp_g(M, ∅, ∅) = (l_g, σ_g). Moreover, if σ_f(l_f) = n, true, or false, then σ_f(l_f) = σ_g(l_g).

A naive induction runs afoul, though, since the interpreter can return intermediate results that
are neither numbers nor booleans. We therefore need to strengthen the induction hypothesis. If
interp_f returns a closure or suspension, the result returned by interp_g may not literally be the
same: for instance, interp_f may return a location holding susp(l_0) and interp_g may return a
location holding susp(l_1). Nevertheless, these values should be the same up to a renaming of the
locations in the domain of the returned store σ_f.

Formalizing  the  notion  of  when  two  stores  are  ‘equivalent’  up  to  renaming  of  their  locations 
can  be  done  using  the  underlying  graphs.  Two  stores  are  ‘equivalent’  if  their  underlying  graph 
representations  are  isomorphic  via  some  function  h,  and  the  values  held  at  the  cells  are  ‘equivalent’ 
under  h.  More  formally. 

Definition 9 Two states S = (l̂, ρ̂, σ) and S' = (l̂', ρ̂', σ') are congruent if there is an isomorphism
h : G(σ) → G(σ') such that for any l ∈ dom(σ), refcount(l, σ) = refcount(h(l), σ') and for any
l ∈ dom(σ),

1. For all i, h(l_i) = l'_i;

2. For all i, dom(ρ_i) = dom(ρ'_i) and for all x ∈ dom(ρ_i), h(ρ_i(x)) = ρ'_i(x);

3. If σ(l) = n, true, or false, then σ(l) = σ'(h(l));

4. If σ(l) = susp(l'), then σ'(h(l)) = susp(h(l'));

5. If σ(l) = rec(l', f), then σ'(h(l)) = rec(h(l'), f);

6. If σ(l) = closure(λx. P, ρ), then σ'(h(l)) = closure(λx. P, ρ'), dom(ρ) = dom(ρ'), and for any
x ∈ dom(ρ), ρ'(x) = h(ρ(x));

7. If σ(l) = recclosure(λx. P, ρ), then σ'(h(l)) = recclosure(λx. P, ρ'), dom(ρ) = dom(ρ'), and for
any x ∈ dom(ρ), ρ'(x) = h(ρ(x)); and

8. If σ(l) = thunk(P, ρ), then σ'(h(l)) = thunk(P, ρ'), dom(ρ) = dom(ρ'), and for any x ∈ dom(ρ),
ρ'(x) = h(ρ(x)).

Then  one  may  prove 

Lemma 10 Suppose (l̂', ρ_f, ρ̂', σ_f) and (l̂'', ρ_g, ρ̂'', σ_g) are congruent. If interp_f(M, ρ_f, σ_f) =
(l'_f, σ'_f), then interp_g(M, ρ_g, σ_g) = (l'_g, σ'_g) and the resultant states (l'_f, l̂', ρ̂', σ'_f) and (l'_g, l̂'', ρ̂'', σ'_g)
are congruent.

The proof is deferred to the appendix. From this lemma, the following theorem follows directly:

Theorem 11 Suppose f and g are allocation relations. If interp_f(M, ∅, ∅) = (l_f, σ_f), then
interp_g(M, ∅, ∅) = (l_g, σ_g). Moreover, if σ_f(l_f) = n, true, or false, then σ_f(l_f) = σ_g(l_g).

Correctness  of  the  interpreter. 

Finally, we need to verify that the reference-counting semantics implements the natural semantics
of Tables 4 and 5, i.e., evaluating a closed term of base type yields the same result in either
semantics. The proof proceeds by induction on the number of steps in the evaluation (the height
of the proof tree for the (⇒) direction, and the number of calls to interp for the (⇐)
direction). We again need an expanded induction hypothesis to carry out the proof, one in which
we can relate the values held in memory locations to terms. To this end, we define the
extraction functions valof(M, ρ, σ) and valofcell(l, σ). Intuitively, the function valofcell extracts a term
from the storable value held at location l in store σ, and the function valof replaces the free
variables of M with the extracted versions of the cells bound to the free variables according to ρ.
The idea is easy to understand intuitively from an example. Suppose, for instance, cell l_0 holds
thunk((dispose x before y), [x ↦ l_1, y ↦ l_2]), l_1 holds susp(l_3), l_3 holds 0, and l_2 holds true in the
store σ. Then valofcell(l_0, σ) = (dispose (store 0) before true). A larger example appears in
Figure 5 (where reference counts have been ignored); if σ is the store depicted there, then

valofcell(l, σ) =

λf. (((λh. λy. (share h_1, h_2 as h in h_1 (h_2 y))) f) (store ((λx. x) true)))

Formal definitions for valof and valofcell are given by simultaneous induction in Table 8 in the
appendix. A similar definition is given in [Plo75] for unwinding a closure relative to an SECD
machine state.

[Figure 5: Store for Example of the valofcell Operation. The cells shown hold a closure for
λh. λy. (share h_1, h_2 as h in h_1 (h_2 y)) with empty environment, a closure for λf. ((g f) x) whose
environment binds g and x, a suspension, a thunk for (f true) whose environment binds f, and a
closure for λx. x with empty environment.]

Since we will be interpreting terms of arbitrary type, the induction hypothesis must relate values
returned by the natural semantics to values returned by the reference-counting interpreter. The
key definition missing here is the definition of 'related values'. One might attempt to extend the
statement of the theorem directly: that is, for closed terms, M ⇓ c iff interp(M, ∅, ∅) = (l', σ')
and valofcell(l', σ') = c. While this statement holds for basic values, it does not hold for values of
other types. The problem arises because the reference-counting interpreter memoizes the results of
evaluating under stores whereas the natural semantics does not. For instance, evaluating the term

(λx. (share y,z as x in if (zero? (fetch y)) then z else z)) (store (succ 5))

in the natural semantics returns the value (store (succ 5)), whereas evaluating the expression in the
reference-counting semantics returns the value (after unwinding) (store 6). The proof thus requires
relating terms that are 'less evaluated' to terms that are 'more evaluated'.

Definition 12 M > N, read 'N requires less evaluation than M', iff M = C[M'], N = C[c], M'
is closed, and M' ⇓ c,

where C[ ] denotes a term with a missing subterm and C[M'] the term resulting from using M' for
that subterm. Let >* be the reflexive, transitive closure of >. This relation is necessary in order
to express the desired property:

Theorem 13 Suppose M is typeable, dom(ρ) = FV(M), M' is closed, and M' >* valof(M, ρ, σ).
Suppose also that ℜ(l̂', ρ, ρ̂', σ).

1. If M' ⇓ c, then interp(M, ρ, σ) = (l', σ') and c >* valofcell(l', σ').

2. If interp(M, ρ, σ) = (l', σ'), then M' ⇓ c >* valofcell(l', σ').

The extra assumptions about the state (l̂', ρ, ρ̂', σ), namely that it satisfies the invariants above,
are used in constructing an execution in the reference-counting interpreter. The proof is deferred
to the appendix.
6 Linear Logic and Memory

Let us now examine the question of the circumstances under which we are ensured that a location
holding a value of linear type will maintain a reference count of at most one. In general, there is
no guarantee that locations holding linear values will always have a reference count of one during
the evaluation of a program. Consider, for example, the term

(λw. (share x,y as w in if (zero? (fetch y)) then x else x)) (store 5).

During evaluation, a suspension is placed in a location l, which in turn holds a pointer to a location
l' holding a thunk containing the value 5. This location l is then passed to w, and two pointers
called x and y are then created by the share which reference l. Pictorially,

[Diagram: the pointers x and y both reference the location l.]

When the evaluation continues to the point of (fetch y), the contents of the location l' are evaluated
to a location l'' holding 5, the suspension in l is updated to point to l'', and a pointer to l'' is then
passed to the evaluation of zero?. Pictorially,

[Diagram: the location l and the argument of zero? both reference the location l''.]

Thus, the cell containing 5 now has two pointers to it, even though it has linear type, Nat.

Clearly the issue here is whether the location holding a linear value is accessible from a location
holding a non-linear one, like a susp. We would like a static condition under which we know that
this does not happen. This seems difficult because, on the face of it, there are circumstances where
a computation can alter the memory graph so that a linear value is brought into a location that is
referenced by a non-linear value. Consider the term:

M = λx : Nat. λf : Nat ⊸ !Nat. (store y where y = (f x))        (1)

If N is a term of type Nat ⊸ !Nat, then the evaluation of ((M 0) N) may create a memory graph
in which the location holding 0 has been brought into precisely the circumstance above; so its
reference count might be increased by pointers passed through a susp. We need to know when this
can happen if we are to have any way to ensure that a linear value maintains a reference count of
at most one.
There is some help on this point to be found in the proof theory of linear logic. Note that the
problem with term M in (1) relies on having a term N of type Nat ⊸ !Nat. From the standpoint
of linear logic and its translation under the Curry-Howard correspondence, this is a suspicious
assumption, however. The proposition A ⊸ !A is not provable in LL, and the situation illustrated
by M runs contrary to proof-theoretic facts about what propositions are moved through 'boxes'
in a proof net during cut elimination [Gir87]. This does not directly prove that a static property
exists for the LL-based programming language, but it does suggest that there is hope.

To assert the desired property precisely, we will need some more terminology. Let us say that
a storable object is linear if it is a numeral, boolean, closure, or recclosure, and say that it is
non-linear if it has the form susp(l), rec(l, f), or thunk(M, ρ). We say that a location l is non-linear
in store σ if σ(l) is a non-linear object; similarly, a location l is linear in store σ if σ(l) is a
linear object. The key property concerns the nature of the path in the memory graph between a
location and the root set.

Definition 14 Suppose S = (l_0, σ, l̄, ρ̄) is a regular state and l ∈ dom(σ). The location l is said to
be linear from l_0 in S if there is a path p from l_0 to l in G(S) such that each l' on p satisfies the
following two properties:

1. σ(l') is linear and

2. refcount(l', σ) = 1.

Note that the two conditions satisfied by the path p could only be satisfied by a unique path from l_0
to l; if there were more than one such path, condition (2) could not be satisfied. It will be convenient
to say that a path satisfying these conditions is linear. Given a regular state S = (ρ, σ, l̄, ρ̄), we
also say that l is linear from ρ in S if there is an x in the domain of ρ such that there is a (unique)
linear path from ρ(x) to l.
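As an informal illustration of Definition 14, the following Python sketch checks whether a location is linear from a root in a small memory graph. The `Cell` record, the kind names, and the dictionary encoding of the store are assumptions made for this example only; they are not part of the formal development.

```python
from dataclasses import dataclass

# Kinds of storable objects that the text classifies as linear.
LINEAR_KINDS = {"num", "bool", "closure", "recclosure"}


@dataclass
class Cell:
    kind: str        # e.g. "num", "closure", "susp", "rec", "thunk"
    succs: tuple     # locations this cell points to (hypothetical encoding)
    refcount: int


def is_linear_from(store, root, target):
    """True if some path root -> ... -> target visits only linear cells
    whose reference count is exactly one (endpoints included)."""
    seen = set()
    stack = [root]
    while stack:
        loc = stack.pop()
        if loc in seen:
            continue
        seen.add(loc)
        cell = store[loc]
        if cell.kind not in LINEAR_KINDS or cell.refcount != 1:
            continue  # a linear path cannot pass through this location
        if loc == target:
            return True
        stack.extend(cell.succs)
    return False


def is_linear_from_env(store, env, target):
    """True if target is linear from rho(x) for some x in dom(rho)."""
    return any(is_linear_from(store, loc, target) for loc in env.values())


# A closure at location 0 holding the only pointer to the numeral at 1.
good = {0: Cell("closure", (1,), 1), 1: Cell("num", (), 1)}
# A shared susp at location 0 pointing to the numeral at 1: not a linear path.
bad = {0: Cell("susp", (1,), 2), 1: Cell("num", (), 1)}

print(is_linear_from_env(good, {"x": 0}, 1))
print(is_linear_from(bad, 0, 1))
```

The second store mirrors the problematic situation described earlier: the numeral is reachable only through a non-linear susp, so it is not linear from that root.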

To prove the desired property we will need to know some basic facts about types and evaluation.
For the high-level semantics we already expressed the Subject Reduction Theorem 2 for the
LL-based programming language. In conjunction with the Correctness Theorem 13 we have a version
of the result for the low-level semantics as well:

Lemma 15 Suppose S = (ρ, σ, l̄, ρ̄) is a regular state, dom(ρ) = FV(M), ⊢ valof(M, ρ, σ) : t, and
interp(M, ρ, σ) = (l', σ'). Then ⊢ valofcell(l', σ') : t.

The theorem we wish to express says that if a program is evaluated in an environment from
which a location l is linear, then the value at the location is either used and deallocated, or not used
and linear from the location returned as the result of the evaluation. This statement is intended
to formally capture the idea that a location that is linear from an environment is used once or left
untouched with a reference count of one. Unfortunately, the assertion contains the term 'deallocate',
which needs to be made precise. If we assert instead that, at the end of the computation, the
location either has reference count 0 or is linear from the result, then there is a problem in the
case where the reference count falls to 0, because the allocation relation might reallocate the
location l to hold a value that is unrelated to the one placed there originally. This would make it
impossible to assert anything interesting about the outcome of the computation. To resolve this
worry, we can make a restriction on the allocation relation insisting that l is not in its range. This
assumption is harmless in a sense made precise by Theorem 10. The result of interest can now be
asserted precisely as follows:

Theorem 16 Suppose S = (ρ, σ, l̄, ρ̄) is a regular state, dom(ρ) = FV(M), and valof(M, ρ, σ) is
typeable. If l is linear from ρ in S, and l is not in the range of new, and interp(M, ρ, σ) = (l', σ'),
then one of the following two properties holds of the regular state S' = (l', σ', l̄, ρ̄):

1. Either refcount(l, σ') = 0, or

2. refcount(l, σ') = 1 and l is linear from l' in S'.

Proof:  The  proof  is  by  induction  on  the  number  of  calls  to  interp.  We  exhibit  only  a  few  of  the 
key  cases  here  and  leave  the  others  for  the  reader. 

1. M = (P Q). The evaluation of M begins as follows:

    interp(P, ρ|P, σ) = (l_0, σ_0)

    interp(Q, ρ|Q, σ_0) = (l_1, σ_1)

    σ_1(l_0) = closure(λx. N, ρ') or recclosure(λx. N, ρ').

The fact that l is linear from ρ means that it is reachable from exactly one of ρ|P or ρ|Q.
We consider the two cases separately.

(a) l is reachable from ρ|P. By the induction hypothesis (IH), one of the following two
subcases applies:

i. refcount(l, σ_0) = 0. By assumption, l is never reallocated by new, and hence it
follows that refcount(l, σ') = 0.

ii. refcount(l, σ_0) = 1 and l is linear from l_0. Then in the memory graph, there is a
linear path

    l_0 = l'_0, l'_1, ..., l

(where we list only the locations associated with the path, since the fact that the reference
counts are all equal to one means that the edges are uniquely determined). None of
the locations can be reachable from ρ|Q since that would imply that the reference
count of at least one of them is greater than one. By Theorem 6, the contents and
reference counts of the locations therefore do not change during the evaluation of
Q. Now, l is linear from l_0 in (l_1, l_0, σ_1, l̄, ρ̄) and σ_1(l_0) has the form closure(λx. N, ρ')
or recclosure(λx. N, ρ'), so l must be linear from ρ' in (ρ'[x ↦ l_1], dec(l_0, σ_1), l̄, ρ̄) as
well. Since we know that refcount(l_0, σ_1) = 1, we conclude that

    interp(N, ρ'[x ↦ l_1], dec(l_0, σ_1)) = (l', σ')

and the desired conclusion follows from IH.

(b) l is reachable from ρ|Q. By assumption, there is a linear path

    l'_0, l'_1, ..., l

such that l'_0 is in the image of ρ|Q. None of the locations on this path is reachable
from ρ|P because they all have reference count equal to one. Thus, by Theorem 6, their
values are unchanged by the evaluation of P, and each l'_i is still unreachable from l_0 in
σ_0. By IH, there are two possibilities regarding the regular state (l_1, l_0, σ_1, l̄, ρ̄) obtained
after evaluating P and Q.

i. refcount(l, σ_1) = 0. By assumption l is never reallocated by new, so refcount(l, σ') = 0
as needed.

ii. refcount(l, σ_1) = 1. In this case, the IH implies that there is a linear path from
l_1 to l. There are now two subcases to consider: either refcount(l_0, σ_0) = 1 or
refcount(l_0, σ_0) > 1. We consider only the second and leave the first to the reader.
By laws D2, I2, and E, we know that the state

    S' = (ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1)), l̄, ρ̄)

is regular and it is not hard to check that l is linear from ρ'[x ↦ l_1] in S'. Since we
must have

    interp(N, ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1))) = (l', σ')

we are done by IH.

2. M = (store N where x_1 = M_1, ..., x_n = M_n). In this case, l is reachable from exactly one of
the environments ρ|M_i. In the evaluation of M, we have

    interp(M_1, ρ|M_1, σ) = (l_1, σ_1)
    ...
    interp(M_i, ρ|M_i, σ_{i-1}) = (l_i, σ_i)

By IH, there are two possibilities for the regular state

    (l_1, ..., l_i, ρ|M_{i+1}, ..., ρ|M_n, σ_i, l̄, ρ̄)

arising after the evaluation of M_i. Either the reference count of l is zero in σ_i or it is one
and there is a linear path from l_i to l. If the first case holds, then we are done, since l is not
reallocated in the remainder of the computation, and therefore the conclusion of the theorem
is satisfied. On the other hand, the second case is impossible: by Lemma 15, valofcell(l_i, σ_i)
has type !t and σ_i(l_i) is a value, so it has the form susp(l'') or rec(l'', f). This contradicts the
assumption that l is linear from l_i. Therefore the reference count of l must be 0 in σ_i and hence
we are done, since new never reallocates l.

3. M = (share x,y as P in Q). In the evaluation of M we compute

    interp(P, ρ|P, σ) = (l_0, σ_0)

    interp(Q, (ρ|Q)[x, y ↦ l_0], inc(l_0, σ_0)) = (l', σ')

Now l is reachable from exactly one of the environments ρ|P or ρ|Q. We consider the two
cases separately.

(a) l is reachable from ρ|P. For the same reasons discussed in the case for store above, IH
implies that refcount(l, σ_0) = 0, and thus we are done since new never reallocates l.

(b) l is reachable from ρ|Q. Then there is a linear path from ρ|Q to l which, by Theorem 6,
is unaffected by the evaluation of P. In particular, l is not reachable from l_0, so it is
linear from ρ|Q in the regular state ((ρ|Q)[x, y ↦ l_0], inc(l_0, σ_0), l̄, ρ̄), so we are done by
IH.

The remaining cases are treated similarly. ■

To see an example of how the theorem can be applied to reasoning about properties that depend
on the memory graph, suppose we want to evaluate add (store 2) 3 in the empty environment and
empty store. The key steps are

• add (store 2) evaluates to (l_0, σ_0) with σ_0(l_0) = closure(λx. N, ρ).

• 3 in σ_0 evaluates to (l_1, σ_1) such that σ_1(l_1) = 3.

• The body of add is evaluated with y mapped to l_1.

At this point the conditions required for the theorem above are true. Hence we know that the
reference count of l_1 does not exceed one (so long as it is not deallocated and then reallocated).
This implies that it is safe to update y in place during the recursive call. Similar analysis applies
to definitions of multiplication and other recursive functions where we use a variable as an
accumulator to store the result. This technique of proof allows us to achieve goals like those for which
Hudak [Hud87] defined a collecting interpretation for reference counts.

7 Discussion

For this paper we have chosen a particular natural deduction presentation of linear logic.
Others have proposed different formulations of linear logic, and it would be interesting to carry out
similar investigations for those formulations. For instance, Abramsky [Abr] has used the sequent
formulation of linear logic. His system satisfies substitutivity because this is essentially a rule of
the sequent presentation (the cut rule, to be precise), but there is no clear means of doing type
inference for his language. Others [Mac91, LM92] have attempted to reconcile the problems of type
inference and substitutivity by proposing restricted forms of these properties. Another approach
has been to modify linear logic by adding new assumptions. For instance, [Wad91a] and [O'H91]
propose taking !!A to be isomorphic to !A; from the perspective of this paper, such an identification
would collapse two levels of indirection and suspension into one and hence fundamentally change
the character of the language. Other approaches to the presentation of LL seem to have compatible
explanations within our framework, but might yield slightly different results. For example, there
is a way to present LL using judgements of the form Γ; Δ ⊢ s where Γ is a set of 'intuitionistic
assumptions' (types of non-linear variables) and Δ is a multi-set of 'linear assumptions' (types of
linear variables). This approach might suit the results of Section 6 better than the presentation we
used in this paper because it singles out the linear variables more clearly and provides what might
be a simpler term language. On the other hand, the connection with reference counts is less clear
for that formulation.

It is also possible to fold reference-counting operations into the interpretation of a garden-variety
functional programming language (that is, one based on intuitionistic logic). The ways in which the
result differs from the semantics we have given for an LL-based language are illuminating. First of
all, there are several choices about how to do this. One approach is to maintain the invariant that
interp is evaluated on triples (M, ρ, σ) where the domain of ρ is exactly the set of free variables
of M. When evaluating an application M = (P Q), for example, it is essential to account for the
possibility that some of the free variables of M are shared between P and Q. This means that
when P is interpreted, the reference counts of variables they have in common must be incremented
(otherwise they may be deallocated before the evaluation of Q begins):

    interp((P Q), ρ, σ) =
      let (l_0, σ_0) = interp(P, ρ|P, inc-env(ρ|P ∩ ρ|Q, σ))
          (l_1, σ_1) = interp(Q, ρ|Q, σ_0)
      in case σ_1(l_0) of closure(λx. N, ρ') or recclosure(λx. N, ρ') =>
           if refcount(l_0, σ_1) = 1
           then interp(N, ρ'[x ↦ l_1], dec(l_0, σ_1))
           else interp(N, ρ'[x ↦ l_1], inc-env(ρ', dec(l_0, σ_1)))

The deallocation of variables is driven by the requirement that only the free variables of M can lie
in the domain of ρ; this arises particularly in the semantics for the conditional:

    interp(if N then P else Q, ρ, σ) =
      let (l_0, σ_0) = interp(N, ρ|N, inc-env(ρ|N ∩ (ρ|P ∪ ρ|Q), σ))
      in if σ_0(l_0) = true
         then interp(P, ρ|P, dec(l_0, dec-ptrs-env((ρ|P) − (ρ|Q), σ_0)))
         else interp(Q, ρ|Q, dec(l_0, dec-ptrs-env((ρ|Q) − (ρ|P), σ_0)))

An alternative approach to providing a reference-counting semantics for an intuitionistic language
would be to delay the deallocation of variables until 'the last minute' and permit the application
of interp to triples (M, ρ, σ) where the domain of ρ includes the free variables of M but may also
include other variables. This makes it possible to simplify the interpretation of the conditional:

    interp(if N then P else Q, ρ, σ) =
      let (l_0, σ_0) = interp(N, ρ, σ)
      in if σ_0(l_0) = true
         then interp(P, ρ, dec(l_0, σ_0))
         else interp(Q, ρ, dec(l_0, σ_0))

but the burden of disposal then shifts to the evaluation of constants:

    interp(n, ρ, σ) = new(n, dec-ptrs-env(ρ, σ))

The basic difference between a 'reference-counting interpretation of intuitionistic logic' following
one of the approaches just described and the reference-counting interpretation of linear logic is the
way in which the LL primitives make many distinctions explicit in the code. The LL primitives make
it possible to describe certain kinds of 'code motion' that concern when memory is deallocated. For
example, the program

    λx : s. if B then (dispose x before P) else (dispose x before Q)

can be shown to be equivalent in the higher-level semantics to

    λx : s. (dispose x before if B then P else Q)

but the latter program can be viewed as preferable in the reference-counting semantics, because it
may deallocate the locations referenced by x sooner. As another example, the program

    λx : s. (dispose y before M)

is equivalent to

    (dispose y before λx : s. M)

if x and y are different variables. The transformation may be significant if the value of y would be
deallocated rather than needlessly held in a closure.

Proving that programs like the ones above are equivalent as far as the high-level semantics is
concerned can be facilitated by a fixed-point (denotational) semantics for the LL-based language.
A reasonable semantics of this kind can be given as an extension of the semantics of call-by-value
with the operation !s being interpreted as the lifting operation on domains. For such a semantics it
is possible to extend the adequacy result using the standard techniques (as in section 6.2 of [Gun92]
or 11.4 of [Win93]).

The  question  of  whether  an  LL-based  language  could  be  useful  as  an  intermediate  language  for 
compiler  analysis  for  intuitionistic  programming  languages  is  certainly  related  to  the  techniques 
for  translating  between  them.  By  analogy,  there  have  been  various  studies  of  the  subtleties  of 
transformation to CPS ([LD93] is one recent example). A closer analogy is the translation of a
language meant to be executed in call-by-name into a call-by-value language with primitives for
delaying (store'ing) and forcing (fetch'ing). There is a standard translation for this purpose and
many of the issues that arise for that translation also arise in the translation from intuitionistic
to linear logic. For instance, a pair of programs that are strongly reminiscent of those in Table 1
appears in the discussion of the ALFL compiler in [BHY88] based on a sample from the test
suite in [Gab85]. This problem is addressed by the technique of strictness analysis [AH87]: with
strictness analysis the translation can be made more efficient or the translated program can be
optimized. There are several techniques known for translating intuitionistic logic into linear logic.
To illustrate, consider the combinator S (here written in ML syntax):

    fn x => fn y => fn z => (x z) (y z)

When  we  apply  Girard’s  translation,  the  result  (using  a  syntax  similar  to  the  one  in  Table  1)  is  the 
following  program: 

    fn x => fn y => fn z =>
      share z1,z2 as z in
        ((fetch x) (store (fetch z1)))
          (store ((fetch y) (store (fetch z2))))

However,  another  program  having  S  as  its  ‘erasure’  is 

    fn x => fn y => fn z =>
      share z1,z2 as z in (x z1) (y z2)

which  is  evidently  a  much  simpler  and  more  efficient  program.  An  analog  of  strictness  analysis  that 
applies  to  the  LL  translation  is  clearly  needed  if  an  LL  intermediate  language  is  to  be  of  practical 
significance  in  analyzing  ‘intuitionistic’  programs. 
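The cost difference between the two programs with erasure S can be made concrete with a small Python sketch. Here a memoised `Susp` object stands in for a value of type !A, `store` delays a computation, and `fetch` forces it; the class, the allocation counter, and the sample arguments are all assumptions introduced for illustration, not the interpreter defined in this paper.

```python
class Susp:
    """A memoised suspension, standing in for a value of type !A."""
    allocated = 0  # number of suspensions created so far

    def __init__(self, thunk):
        Susp.allocated += 1
        self._thunk = thunk
        self._forced = False
        self._value = None

    def fetch(self):
        if not self._forced:
            self._value = self._thunk()
            self._forced = True
            self._thunk = None
        return self._value


def store(thunk):
    return Susp(thunk)


def s_girard(x):
    # Every argument arrives as a suspension and is re-suspended on use.
    return lambda y: lambda z: x.fetch()(store(lambda: z.fetch()))(
        store(lambda: y.fetch()(store(lambda: z.fetch())))
    )


def s_simple(x):
    # Only z is shared; x and y are applied directly.
    return lambda y: lambda z: x(z)(y(z))


def run_girard():
    Susp.allocated = 0
    x = store(lambda: lambda a: lambda b: a.fetch() + b.fetch())
    y = store(lambda: lambda a: 2 * a.fetch())
    z = store(lambda: 5)
    return s_girard(x)(y)(z), Susp.allocated


def run_simple():
    Susp.allocated = 0
    z = store(lambda: 5)
    f = lambda a: lambda b: a.fetch() + b
    g = lambda a: 2 * a.fetch()
    return s_simple(f)(g)(z), Susp.allocated


print(run_girard())
print(run_simple())
```

Both runs compute the same number, but the first allocates several suspensions where the second allocates only the one for z, which is the kind of overhead an analog of strictness analysis would aim to remove.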

Our reference-counting interpreter and the associated invariance properties can easily be
extended to the linear connectives &, ⊗, and ⊕ (although it is unclear how to handle the 'classical'
connectives). Extending the results to dynamic allocation of references and arrays is not difficult
if such structures do not create cycles. For instance, it can be assumed that only integers and
booleans are assignable to mutable reference cells. To see this in a little more detail, if we assume
that o is Nat or Bool, then typing rules can be given as follows:

    Γ ⊢ M : o                Γ ⊢ M : ref(o)    Δ ⊢ N : o        Γ ⊢ M : ref(o)
    -------------------      ----------------------------      --------------
    Γ ⊢ ref(M) : ref(o)      Γ, Δ ⊢ M := N : ref(o)            Γ ⊢ !M : o

To create a reference cell initialized with the value of a term M, the term M is evaluated and its
value is copied into a new cell:

    (16)  interp(ref(M), ρ, σ) =
            let (l_0, σ_0) = interp(M, ρ|M, σ)
            in new(σ_0(l_0), dec(l_0, σ_0))

The location l_0 holds the immutable value of M; a new mutable cell must be created with the value
of M as its initial value. Assignment mutates the value associated with such a cell:

    (17)  interp(M := N, ρ, σ) =
            let (l_0, σ_0) = interp(M, ρ|M, σ)
                (l_1, σ_1) = interp(N, ρ|N, σ_0)
            in (l_0, dec(l_1, σ_1[l_0 ↦ σ_1(l_1)]))

To obtain the value held in a mutable cell denoted by M, the contents of the cell must be copied
to a new immutable cell:

    (18)  interp(!M, ρ, σ) =
            let (l_0, σ_0) = interp(M, ρ|M, σ)
            in new(σ_0(l_0), dec(l_0, σ_0))

Although the code for creating a reference cell and the code for dereferencing look the same, they
are dual to one another in the sense that cell creation, ref(M), copies the contents of an immutable
cell to a mutable one while dereferencing, !M, copies the contents of a mutable cell to an immutable
one. The language designed in this way is similar to Scheme with force and delay primitives, but
with restrictions like those of ML on which values are mutable. The restriction on the types of
elements held in reference cells is similar to those made for block-structured languages, which do
not permit higher-order procedures to be assigned to variables (reference cells).
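The three clauses for reference cells can be sketched in Python as follows. This is only an approximation under stated assumptions: the store is a mutable Python object rather than a value σ threaded through the interpreter, fresh locations are never reused, and the names `Store`, `make_ref`, `assign`, and `deref` are hypothetical. Each function takes locations that have already been produced by evaluating the subterms.

```python
class Store:
    """A store mapping locations to [value, refcount] pairs."""

    def __init__(self):
        self.cells = {}
        self._next = 0  # fresh locations are never reused in this sketch

    def new(self, value):
        loc = self._next
        self._next += 1
        self.cells[loc] = [value, 1]
        return loc

    def inc(self, loc):
        self.cells[loc][1] += 1

    def dec(self, loc):
        self.cells[loc][1] -= 1
        if self.cells[loc][1] == 0:
            del self.cells[loc]  # deallocate

    def value(self, loc):
        return self.cells[loc][0]

    def refcount(self, loc):
        return self.cells[loc][1] if loc in self.cells else 0


def make_ref(store, l0):
    # Clause (16): copy the immutable value at l0 into a fresh mutable cell.
    v = store.value(l0)
    store.dec(l0)
    return store.new(v)


def assign(store, l0, l1):
    # Clause (17): overwrite cell l0 with the value at l1, then release l1.
    store.cells[l0][0] = store.value(l1)
    store.dec(l1)
    return l0


def deref(store, l0):
    # Clause (18): copy the contents of the mutable cell into a fresh cell.
    v = store.value(l0)
    store.dec(l0)
    return store.new(v)


s = Store()
a = s.new(3)
r = make_ref(s, a)   # a is released; r holds 3
b = s.new(7)
assign(s, r, b)      # r now holds 7; b is released
s.inc(r)             # keep r alive across the dereference
v = deref(s, r)
print(s.value(v), s.refcount(r), s.refcount(a), s.refcount(b))
```

Because only constants are copied into and out of mutable cells, no pointer is ever stored in a cell, which is why the acyclicity of the memory graph is preserved in this sketch.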

In  conclusion,  we  have  demonstrated  that  a  language  whose  design  is  guided  by  an  analog  of  the 
Curry-Howard  correspondence  applied  to  linear  logic  can  be  interpreted  as  providing  fine-grained 
information about reference counts in the memory graphs produced by the program during
run-time. As such, the LL-based language may be useful for detecting or proving the correctness of
forms  of  program  analysis  that  rely  on  reference  counts  of  nodes  of  memory  graphs.  As  a  secondary 
theme  we  have  illustrated  an  approach  to  expressing  and  proving  properties  of  programs  at  a  level 
of  abstraction  in  which  properties  of  memory  graphs  are  significant  but  some  lower-level  properties, 
such  as  memory  layout,  are  abstracted  away.  Isolating  this  level  of  abstraction  could  be  useful  for 
correctness proofs of lower levels, such as the correctness of a memory allocation scheme.

A Proofs of the Main Theorems

Verification  of  the  Basic  Laws  in  Table  7 

Proposition 17 Each of the laws A1, A2, D1, D2 given in Section 4 holds.

Proof: The proof of A1 may be found in Section 4, and the proof of A2 is similar. We thus need
only to verify D1 and D2.

D1 Suppose S = (l, l̄, ρ̄, σ), ℜ(S) holds, σ(l) is a numeral or boolean, and S' = (l̄, ρ̄, dec(l, σ)).
Note that there are no outgoing edges from l in the memory graph induced by S; thus, even
if l ∉ dom(dec(l, σ)), the state S' is count-correct. Since dom(σ) ⊇ dom(dec(l, σ)), each of the
properties ℜ2-ℜ5 follows directly from the hypothesis. Thus, ℜ(S').

D2 Suppose ℜ(l, l̄, ρ̄, σ) and refcount(l, σ) ≠ 1, and let S' = (l̄, ρ̄, dec(l, σ)). By hypothesis, it
follows that refcount(l, σ) > 1 since l is in the root set. Thus, refcount(l, dec(l, σ)) ≥ 1 and
hence S' is count-correct, satisfying ℜ1. Since dom(σ) = dom(dec(l, σ)), each of the properties
ℜ2-ℜ5 follows directly from the hypothesis. Thus, ℜ(S').

This completes the verification of each part. ■

Proposition 18 Law D3 holds; more generally,

1. If ℜ(l, l̄, ρ̄, σ), then ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

2. If ℜ(l̄, ρ, ρ̄, σ), then ℜ(l̄, ρ̄, dec-ptrs-env(ρ, σ)).

Proof: By induction on the total number of calls to dec-ptrs and dec-ptrs-env. In the basis, suppose
the number of calls is one; there are two cases:

1. dec-ptrs is called. Then there are three subcases:

(a) σ(l) = n, true, or false. Then dec-ptrs(l, σ) = dec(l, σ). By D1, ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

(b) σ(l) = susp(l'), thunk(M, ρ), or closure(λx. M, ρ), and refcount(l, σ) > 1. Then
dec-ptrs(l, σ) = dec(l, σ), and hence by D2, ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

(c) σ(l) = rec(l', f) or recclosure(λx. N, ρ), and refcount(l, σ) > 2. Then dec-ptrs(l, σ) =
dec(l, σ), and hence by D2, ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

2. dec-ptrs-env is called. Then since dec-ptrs is not called, dom(ρ) must be the empty set. Thus,
dec-ptrs-env(ρ, σ) = σ and hence ℜ(l̄, ρ̄, dec-ptrs-env(ρ, σ)).

For the induction step, suppose the total number of calls to dec-ptrs and dec-ptrs-env is
greater than one. There are again two main cases:

1. dec-ptrs is called. There are five subcases depending on the reference count and the value
stored at l.

(a) σ(l) = susp(l') and refcount(l, σ) = 1. Then dec-ptrs(l, σ) = dec-ptrs(l', dec(l, σ)).
By A2, ℜ(l', l̄, ρ̄, dec(l, σ)) and so by induction, ℜ(l̄, ρ̄, dec-ptrs(l', dec(l, σ))). Thus,
ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

(b) σ(l) = thunk(M, ρ) and refcount(l, σ) = 1. Then dec-ptrs(l, σ) = dec-ptrs-env(ρ, dec(l, σ)).
By A1, ℜ(l̄, ρ, ρ̄, dec(l, σ)) and so by induction, ℜ(l̄, ρ̄, dec-ptrs-env(ρ, dec(l, σ))). Thus,
ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

(c) σ(l) = closure(λx. M, ρ) and refcount(l, σ) = 1. Similar to the previous case.

(d) σ(l) = recclosure(λx. N, ρ[f ↦ l']), σ(l') = rec(l, f), refcount(l, σ) = 2, and
refcount(l', σ) = 1. Then dec-ptrs(l, σ) = dec-ptrs-env(ρ, dec(l', dec(l, dec(l, σ)))). Let
σ_0 = dec(l', dec(l, dec(l, σ))); then the state S = (l̄, ρ, ρ̄, σ_0) is count-correct, since both
l and l' have disappeared from the memory graph. Also, S satisfies properties ℜ2-ℜ5,
since dom(σ_0) ⊆ dom(σ). Thus, ℜ(S), and hence by induction ℜ(l̄, ρ̄, dec-ptrs-env(ρ, σ_0)).
Thus, ℜ(l̄, ρ̄, dec-ptrs(l, σ)).

(e) σ(l) = rec(l', f), σ(l') = recclosure(λx. N, ρ[f ↦ l]), refcount(l, σ) = 2, and
refcount(l', σ) = 1. Then dec-ptrs(l, σ) = dec-ptrs-env(ρ, dec(l', dec(l, dec(l, σ)))). Similar
to the previous case.

2. dec-ptrs-env is called. Since the number of calls is greater than one, dom(ρ) = {x_1, ..., x_n}
for n > 0. Since ℜ(ρ(x_1), ..., ρ(x_n), l̄, ρ̄, σ) and

    dec-ptrs-env(ρ, σ) = dec-ptrs(ρ(x_n), dec-ptrs(... dec-ptrs(ρ(x_1), σ) ...))

it follows from repeated applications of the induction hypothesis that ℜ(l̄, ρ̄, dec-ptrs-env(ρ, σ)).

This completes the induction step and hence the proof. ■
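The case analysis in this proof follows the recursive structure of dec-ptrs and dec-ptrs-env. A minimal Python sketch of that structure is given below, assuming a store encoded as a dictionary from locations to (kind, payload, refcount) triples. The encoding is hypothetical, and the rec and recclosure cases, which need the two-step treatment of cases (d) and (e), are deliberately omitted.

```python
def dec(store, loc):
    """Decrement loc's reference count, removing the cell at zero."""
    kind, payload, count = store[loc]
    if count == 1:
        del store[loc]
    else:
        store[loc] = (kind, payload, count - 1)


def dec_ptrs(store, loc):
    """Release one pointer to loc; if loc is deallocated as a result,
    recursively release the pointers it held."""
    kind, payload, count = store[loc]
    dec(store, loc)
    if count > 1 or kind == "const":
        return                           # cell survives, or has no out-edges
    if kind == "susp":                   # payload is a single location
        dec_ptrs(store, payload)
    elif kind in ("thunk", "closure"):   # payload is an environment
        dec_ptrs_env(store, payload)


def dec_ptrs_env(store, env):
    """Apply dec_ptrs to rho(x1), ..., rho(xn) in turn."""
    for loc in env.values():
        dec_ptrs(store, loc)


sigma = {
    0: ("susp", 1, 1),
    1: ("thunk", {"x": 2}, 1),
    2: ("const", 5, 2),
}
dec_ptrs(sigma, 0)
print(sigma)
```

In the example, releasing the only pointer to the susp deallocates it and the thunk beneath it, while the constant, which had a second pointer, survives with its count reduced by one.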

Proposition 19 Each of the laws I1, I2, E, N1, N2, N3, U1, and U2 in Section 4 holds.

Proof: We verify each law individually.

I1 Suppose ℜ(l̄, ρ̄, σ) and l ∈ dom(σ). Let S' = (l, l̄, ρ̄, inc(l, σ)). Since there is one more
pointer to l in the root set of S' and the reference count has been incremented, S' is
count-correct. Since dom(σ) = dom(inc(l, σ)), each of the properties ℜ2-ℜ5 follows directly from the
hypothesis. Thus, ℜ(S').

I2 Suppose ℜ(l̄, ρ̄, σ) and ρ(x) ∈ dom(σ) for all x ∈ dom(ρ). Then ℜ(l̄, ρ, ρ̄, inc-env(ρ, σ)) follows
by an easy induction on the size of dom(ρ) using an argument similar to the last case.

E Suppose S = (l, l̄, ρ, ρ̄, σ), ℜ(S) holds, and x ∉ dom(ρ), and let S' = (l̄, ρ[x ↦ l], ρ̄, σ). Then
the root sets of S and S' are identical, and the memory graphs induced by S and S' are hence
identical. Thus, ℜ(S'). The converse is similar and omitted.

N1 Suppose ℜ(l̄, ρ̄, σ) and (l', σ') = new(c, σ) for some constant c, and let S' = (l', l̄, ρ̄, σ'). Since
new is an allocation relation, refcount(l', σ) = 0, refcount(l', σ') = 1, and for any location
l ≠ l', σ(l) = σ'(l) and refcount(l, σ) = refcount(l, σ'). First, note that S' is count-correct,
since the only location in σ' that is different from σ is l', and that location has a pointer
in the root set. This verifies property ℜ1. Since dom(σ) is finite, dom(σ') is also finite and
so property ℜ2 holds of S'. Finally, since new does not create any additional cycles in the
memory graph or thunks or closures, properties ℜ3-ℜ5 hold in S'. Thus, ℜ(S').

N2 Suppose ℜ(l̄, ρ, ρ̄, σ), (l', σ') = new(closure(N, ρ), σ) or new(thunk(N, ρ), σ), FV(N) = dom(ρ),
and N is typeable, and let S' = (l', l̄, ρ̄, σ'). Since new is an allocation relation, refcount(l', σ) =
0, refcount(l', σ') = 1, and for any location l ≠ l', σ(l) = σ'(l) and refcount(l, σ) =
refcount(l, σ'). To see that property ℜ1 (namely count-correctness) holds of S', note that all
of the pointers from ρ are accounted for in the closure or thunk stored in l' and that l' only
has reference count 1. Property ℜ2 holds because dom(σ') = dom(σ) ∪ {l'} is finite, since
dom(σ) is. If σ'(l') is a thunk, then refcount(l', σ') = 1, which together with the hypothesis
guarantees property ℜ3. No cycles are created in the induced memory graph by new, so ℜ4
holds. Finally, ℜ5 holds by hypothesis. Thus, ℜ(S').

N3 Suppose ℜ(l, l̄, ρ̄, σ) and (l', σ') = new(susp(l), σ) or new(rec(l, f), σ). Then ℜ(l', l̄, ρ̄, σ')
follows in a manner similar to the previous case.

U1  Suppose  S  =  {l,p,(j)  and  3?(5),  a{l)  is  a  constant.  We  prove  the  first  statement  of  U1  only; 
the  first  follows  similarly.  So  suppose  /'  G  dom(cr),  and  I  is  not  reachable  from  /'  in  the 
memory  graph  induced  by  S,  and  let  S'  =  (/,p,  inc(/',  cr[Z  susp(/')])).  In  S'  the  in-degree  of 
/'  is  now  one  greater  than  in  S;  the  in-degree  of  all  other  nodes  remains  the  same.  Thus,  S' 
satisfies  property  3il.  Since  dom(cr)  =  dom(cr'),  the  domain  of  cr'  is  finite,  satisfying  property 
3?2.  No  new  thunks  are  created,  so  property  3?3  holds  of  S'.  Since  /  is  not  reachable  from  I' 
in  S,  there  is  no  cycle  through  I  in  S'.  Thus,  S'  satisfies  property  3J4.  Finally,  property  K5 
holds  since  no  thunks  or  closures  are  added  to  cr.  Thus,  3?(5'). 

U2  Suppose  S  =  (/,[,  p,  cr)  and  5?(5),  refcount(/,  cr)  7^  1,  a{l)  =  susp(/'),  and  cr(/')  =  thunk(iV,  p), 
and  let  S'  =  (p,/,  p,  dec(/',  dec(/,cr[/  1-+  c]))).  To  verify  property  3^1,  note  first  that 
$\mathrm{refcount}(l', \sigma) = 1$ by hypothesis. Thus, since the pointers from $\sigma(l')$ are mentioned in the
root set of $S'$, it follows that $S'$ is count-correct. It is also clear that each of the properties
$\mathcal{R}2$-$\mathcal{R}5$ holds of $S'$. Thus, $\mathcal{R}(S')$.

This  completes  the  verification  of  each  part.  ■ 
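The count-correctness argument used for law U1 can be checked mechanically on small stores. The following is a minimal sketch, not the paper's formal definition: it assumes a store is a dictionary from locations to `(refcount, object)` pairs, that objects are tagged tuples, and that a state is count-correct when every stored count equals the in-degree of its cell in the memory multigraph (pointers from the root set included). The names `pointers`, `count_correct`, and `update_u1` are hypothetical.

```python
from collections import Counter

def pointers(obj):
    """Locations referenced by a storable object, with multiplicity."""
    kind = obj[0]
    if kind == "const":
        return []
    if kind in ("susp", "rec"):
        return [obj[1]]
    if kind in ("closure", "recclosure", "thunk"):
        return list(obj[2].values())  # obj = (kind, term, env)
    raise ValueError(f"unknown storable object: {kind}")

def count_correct(roots, store):
    """True if each stored count equals the in-degree of its cell."""
    in_degree = Counter(roots)  # pointers from the root set
    for _, obj in store.values():
        in_degree.update(pointers(obj))
    return all(count == in_degree[loc] for loc, (count, _) in store.items())

def update_u1(l, l_prime, store):
    """Model of inc(l', sigma[l -> susp(l')]) for a cell l holding a constant."""
    count, obj = store[l]
    assert obj[0] == "const"
    new_store = dict(store)
    new_store[l] = (count, ("susp", l_prime))
    target_count, target_obj = new_store[l_prime]
    new_store[l_prime] = (target_count + 1, target_obj)
    return new_store

# Root set: the location l = 0 plus the range of rho = {"x": 1}.
roots = [0, 1]
store = {0: (1, ("const", 7)), 1: (1, ("const", 3))}
updated = update_u1(0, 1, store)
```

In this example only the in-degree of location 1 changes, and its count is incremented to match, which is the observation the proof of U1 relies on.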

Proof  of  Lemma  10 

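Lemma 10 Suppose $(l', \rho_f, \rho', \sigma_f)$ and $(l'', \rho_g, \rho'', \sigma_g)$ are congruent. If $\mathrm{interp}_f(M, \rho_f, \sigma_f) = (l'_f, \sigma'_f)$,
then $\mathrm{interp}_g(M, \rho_g, \sigma_g) = (l'_g, \sigma'_g)$ and the resultant states $(l'_f, l', \rho', \sigma'_f)$ and $(l'_g, l'', \rho'', \sigma'_g)$
are congruent.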

Proof: By induction on the number of calls to interp. We cover the four cases in the core language
and leave the other cases to the reader. To make the cases easier to read, let $h$ be the isomorphism
from $\mathcal{G}(\sigma_f)$ to $\mathcal{G}(\sigma_g)$ that makes the above states congruent.

1. $M = x$. Then $\mathrm{interp}_f(M, \rho_f, \sigma_f) = (\rho_f(x), \sigma_f)$. Then also $\mathrm{interp}_g(M, \rho_g, \sigma_g) = (\rho_g(x), \sigma_g)$,
and the resultant states $(\rho_f(x), l', \rho', \sigma'_f)$ and $(\rho_g(x), l'', \rho'', \sigma'_g)$ are congruent via $h$.

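2. $M = (\lambda x.\, P)$. Then $\mathrm{interp}_f(M, \rho_f, \sigma_f) = \mathrm{new}(\mathrm{closure}(\lambda x.\, P, \rho_f), \sigma_f) = (l'_f, \sigma'_f)$. Since $f$ is
an allocation relation,

• $l'_f \notin \mathrm{dom}(\sigma_f)$ and $\mathrm{dom}(\sigma'_f) = \mathrm{dom}(\sigma_f) \cup \{l'_f\}$;

• for all locations $l \in \mathrm{dom}(\sigma_f)$, $\sigma_f(l) = \sigma'_f(l)$ and $\mathrm{refcount}(l, \sigma_f) = \mathrm{refcount}(l, \sigma'_f)$; and

• $\sigma'_f(l'_f) = \mathrm{closure}(\lambda x.\, P, \rho_f)$ and $\mathrm{refcount}(l'_f, \sigma'_f) = 1$.

Note that $\mathrm{interp}_g(M, \rho_g, \sigma_g) = \mathrm{new}(\mathrm{closure}(\lambda x.\, P, \rho_g), \sigma_g) = (l'_g, \sigma'_g)$. Again, since $g$ is an
allocation relation,

• $l'_g \notin \mathrm{dom}(\sigma_g)$ and $\mathrm{dom}(\sigma'_g) = \mathrm{dom}(\sigma_g) \cup \{l'_g\}$;

• for all locations $l \in \mathrm{dom}(\sigma_g)$, $\sigma_g(l) = \sigma'_g(l)$ and $\mathrm{refcount}(l, \sigma_g) = \mathrm{refcount}(l, \sigma'_g)$; and

• $\sigma'_g(l'_g) = \mathrm{closure}(\lambda x.\, P, \rho_g)$ and $\mathrm{refcount}(l'_g, \sigma'_g) = 1$.

Let $h' = h[l'_f \mapsto l'_g]$. It is clear that $h'$ is an isomorphism from $\mathcal{G}(\sigma'_f)$ to $\mathcal{G}(\sigma'_g)$. Now
consider the resultant states $(l'_f, l', \rho', \sigma'_f)$ and $(l'_g, l'', \rho'', \sigma'_g)$. Using the isomorphism $h'$, the
first two conditions for congruence of states are satisfied, and so we just need to show that
the last six properties, stating the relationship between the values stored at locations, are
satisfied. But the contents of the cells in $\sigma_f$ and $\sigma_g$ do not change, and for the new locations,
$\sigma'_f(l'_f) = \mathrm{closure}(\lambda x.\, P, \rho_f)$, $\sigma'_g(h'(l'_f)) = \sigma'_g(l'_g) = \mathrm{closure}(\lambda x.\, P, \rho_g)$, $\mathrm{dom}(\rho_f) = \mathrm{dom}(\rho_g)$,
and for all $x \in \mathrm{dom}(\rho_f)$, $\rho_g(x) = h'(\rho_f(x))$; the last two facts follow from the hypothesis.
Thus, the resultant states are congruent.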


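3. $M = (P\ Q)$. Since $\mathrm{interp}_f(M, \rho_f, \sigma_f) = (l'_f, \sigma'_f)$,

• $\mathrm{interp}_f(P, \rho_f \mid P, \sigma_f) = (l_{f,0}, \sigma_{f,0})$;

• $\mathrm{interp}_f(Q, \rho_f \mid Q, \sigma_{f,0}) = (l_{f,1}, \sigma_{f,1})$;

• $\sigma_{f,1}(l_{f,0}) = \mathrm{closure}(\lambda x.\, N, \rho'_f)$ or $\mathrm{recclosure}(\lambda x.\, N, \rho'_f)$.

By hypothesis the environments $\rho_f$ and $\rho_g$ have the same domain, so $\rho_g$ can also be divided into
$\rho_g \mid P$ and $\rho_g \mid Q$. By two applications of the induction hypothesis,

• $\mathrm{interp}_g(P, \rho_g \mid P, \sigma_g) = (l_{g,0}, \sigma_{g,0})$ and

• $\mathrm{interp}_g(Q, \rho_g \mid Q, \sigma_{g,0}) = (l_{g,1}, \sigma_{g,1})$,

and the states $(l_{f,0}, l_{f,1}, l', \rho', \sigma_{f,1})$ and $(l_{g,0}, l_{g,1}, l'', \rho'', \sigma_{g,1})$ are congruent. In particular, note
that $\sigma_{g,1}(l_{g,0}) = \mathrm{closure}(\lambda x.\, N, \rho'_g)$ or $\mathrm{recclosure}(\lambda x.\, N, \rho'_g)$. There are now two cases:

(a) $\mathrm{refcount}(l_{f,0}, \sigma_{f,1}) = 1$. Then $\mathrm{refcount}(l_{g,0}, \sigma_{g,1})$ is also 1, since the two reference
counts must be the same. Because the states $(l', \rho'_f[x \mapsto l_{f,1}], \rho', \sigma_{f,1})$ and
$(l'', \rho'_g[x \mapsto l_{g,1}], \rho'', \sigma_{g,1})$ are congruent, it follows from the induction hypothesis that
$\mathrm{interp}_g(N, \rho'_g[x \mapsto l_{g,1}], \mathrm{dec}(l_{g,0}, \sigma_{g,1})) = (l'_g, \sigma'_g)$ and $(l'_f, l', \rho', \sigma'_f)$ and $(l'_g, l'', \rho'', \sigma'_g)$ are
congruent. Putting the pieces together, we also see that $\mathrm{interp}_g(M, \rho_g, \sigma_g) = (l'_g, \sigma'_g)$
as desired.

(b) $\mathrm{refcount}(l_{f,0}, \sigma_{f,1}) \neq 1$. Then $\mathrm{refcount}(l_{g,0}, \sigma_{g,1}) \neq 1$ also, since the two reference counts
must be the same. Since $(l', \rho'_f[x \mapsto l_{f,1}], \rho', \sigma_{f,1})$ and $(l'', \rho'_g[x \mapsto l_{g,1}], \rho'', \sigma_{g,1})$ are
congruent, by induction $\mathrm{interp}_g(N, \rho'_g[x \mapsto l_{g,1}], \mathrm{inc\text{-}env}(\rho'_g, \mathrm{dec}(l_{g,0}, \sigma_{g,1}))) = (l'_g, \sigma'_g)$ and
$(l'_f, l', \rho', \sigma'_f)$ and $(l'_g, l'', \rho'', \sigma'_g)$ are congruent. Putting the pieces together, we also see
that $\mathrm{interp}_g(M, \rho_g, \sigma_g) = (l'_g, \sigma'_g)$ as desired.

4. $M = (\mathsf{fetch}\ P)$. Then $\mathrm{interp}_f(P, \rho_f, \sigma_f) = (l_{f,0}, \sigma_{f,0})$. By induction, $\mathrm{interp}_g(P, \rho_g, \sigma_g) =
(l_{g,0}, \sigma_{g,0})$ and the states $(l_{f,0}, l', \rho', \sigma_{f,0})$ and $(l_{g,0}, l'', \rho'', \sigma_{g,0})$ are congruent. Now there are
two main cases: either $\sigma_{f,0}(l_{f,0}) = \mathrm{susp}(l_{f,1})$ or $\sigma_{f,0}(l_{f,0}) = \mathrm{rec}(l_{f,1}, z)$. We leave the second
case to the reader since it is relatively straightforward and consider only the first case.

Suppose $\sigma_{f,0}(l_{f,0}) = \mathrm{susp}(l_{f,1})$. By the definition of congruence, $\sigma_{g,0}(l_{g,0}) = \mathrm{susp}(l_{g,1})$. Now
there are two subcases depending on the object held at $l_{f,1}$.

(a) $\sigma_{f,0}(l_{f,1}) = \mathrm{thunk}(R, \rho'_f)$. Then by congruence, $\sigma_{g,0}(l_{g,1}) = \mathrm{thunk}(R, \rho'_g)$. There are two
subcases depending on the reference count of $l_{f,0}$.

i. $\mathrm{refcount}(l_{f,0}, \sigma_{f,0}) = 1$. Since the above tuples are congruent, $\mathrm{refcount}(l_{g,0}, \sigma_{g,0}) = 1$.
Note that the states

\[ (l', \rho'_f, \rho', \mathrm{dec}(l_{f,1}, \mathrm{dec}(l_{f,0}, \sigma_{f,0}))) \]

\[ (l'', \rho'_g, \rho'', \mathrm{dec}(l_{g,1}, \mathrm{dec}(l_{g,0}, \sigma_{g,0}))) \]

are congruent since $\rho'_f$ and $\rho'_g$ must have the same domain and must match
via the multigraph isomorphism $h$ on their domains. Thus, by induction,
$\mathrm{interp}_g(R, \rho'_g, \mathrm{dec}(l_{g,1}, \mathrm{dec}(l_{g,0}, \sigma_{g,0}))) = (l'_g, \sigma'_g)$ and the states $(l'_f, l', \rho', \sigma'_f)$ and
$(l'_g, l'', \rho'', \sigma'_g)$ are congruent. Putting all the steps together, we also see that
$\mathrm{interp}_g(M, \rho_g, \sigma_g) = (l'_g, \sigma'_g)$.

ii. $\mathrm{refcount}(l_{f,0}, \sigma_{f,0}) \neq 1$. Similar to the previous case.

(b) $\sigma_{f,0}(l_{f,1}) \neq \mathrm{thunk}(R, \rho'_f)$. Then again there are two cases depending on the reference
count of $l_{f,0}$.

i. $\mathrm{refcount}(l_{f,0}, \sigma_{f,0}) = 1$; then $\mathrm{refcount}(l_{g,0}, \sigma_{g,0}) = 1$. Thus,

\[ \mathrm{interp}_g(M, \rho_g, \sigma_g) = (l_{g,1}, \mathrm{dec}(l_{g,0}, \sigma_{g,0})) \]

and the states $(l_{f,1}, l', \rho', \mathrm{dec}(l_{f,0}, \sigma_{f,0}))$ and $(l_{g,1}, l'', \rho'', \mathrm{dec}(l_{g,0}, \sigma_{g,0}))$ are congruent.

ii. $\mathrm{refcount}(l_{f,0}, \sigma_{f,0}) \neq 1$. Similar to the previous case.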

This  completes  the  induction  and  hence  the  proof.  ■ 
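The closure case of the lemma extends the isomorphism $h$ by the pair of freshly allocated locations. The sketch below illustrates that step under simplifying assumptions: a store is a dictionary from locations to `(refcount, object)` pairs, and two stores are treated as matching when a bijection on locations preserves counts and maps every stored pointer accordingly. This collapses the paper's separate congruence conditions into one check, and the names `rename`, `is_isomorphism`, and `new` (with an explicit `fresh` location standing in for an allocation relation) are hypothetical.

```python
def rename(h, obj):
    """Apply the location map h to every pointer inside a storable object."""
    kind = obj[0]
    if kind == "const":
        return obj
    if kind == "susp":
        return ("susp", h[obj[1]])
    if kind == "rec":
        return ("rec", h[obj[1]], obj[2])
    term, env = obj[1], obj[2]  # closure, recclosure, or thunk
    return (kind, term, {x: h[loc] for x, loc in env.items()})

def is_isomorphism(h, store_f, store_g):
    """True if h is a bijection on locations preserving counts and contents."""
    if set(h) != set(store_f) or set(h.values()) != set(store_g):
        return False
    if len(set(h.values())) != len(h):
        return False
    return all(store_g[h[loc]] == (count, rename(h, obj))
               for loc, (count, obj) in store_f.items())

def new(obj, store, fresh):
    """Allocate obj at the given fresh location with reference count 1."""
    assert fresh not in store
    return fresh, {**store, fresh: (1, obj)}

# Two runs that differ only in which locations the allocator hands out.
store_f = {0: (1, ("const", 1))}
store_g = {10: (1, ("const", 1))}
h = {0: 10}
term = ("lam", "x", ("var", "y"))
lf, store_f2 = new(("closure", term, {"y": 0}), store_f, fresh=1)
lg, store_g2 = new(("closure", term, {"y": 10}), store_g, fresh=11)
h2 = {**h, lf: lg}  # h' = h[l'_f -> l'_g]
```

The old map `h` no longer covers the enlarged stores, while the extended map `h2` does, mirroring the passage from $h$ to $h'$ in case 2.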

Proof  of  Theorem  13 

Recall from Section 5 that, in order to prove a correctness theorem, we needed a definition of how
to unwind a term from a store. The definition of two mutually recursive functions for performing
this task, valof and valofcell, appears in Table 8. It is obvious from the definitions that only the
reachable cells affect the value returned by valof and valofcell. For instance, if $l'$ is not reachable
from $l$ in store $\sigma$ and $\sigma' = \mathrm{dec}(l', \sigma)$, then $\mathrm{valofcell}(l, \sigma) = \mathrm{valofcell}(l, \sigma')$. We will use this fact
throughout the arguments that follow.
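The unwinding functions and the reachability fact can be illustrated on a small fragment. This is a minimal sketch rather than the full definition in Table 8: it covers only variables, constants, abstraction, application, and fetch, represents terms and storable objects as tagged tuples, leaves a variable unchanged when it is not bound in the environment (an assumption made here for variables bound by an enclosing abstraction), and models dec as a non-recursive decrement that frees a cell when its count reaches zero (also an assumption).

```python
def valof(term, env, store):
    """Unwind a term by replacing environment-bound variables with cell values."""
    kind = term[0]
    if kind == "var":
        x = term[1]
        return valofcell(env[x], store) if x in env else term
    if kind == "const":
        return term
    if kind == "lam":  # side condition assumed: the bound variable is not in env
        _, x, body = term
        return ("lam", x, valof(body, env, store))
    if kind == "app":
        return ("app", valof(term[1], env, store), valof(term[2], env, store))
    if kind == "fetch":
        return ("fetch", valof(term[1], env, store))
    raise ValueError(f"term form not covered by this sketch: {kind}")

def valofcell(loc, store):
    """Unwind the object stored at loc into a term."""
    _, obj = store[loc]
    kind = obj[0]
    if kind == "const":
        return obj
    if kind in ("closure", "thunk"):  # obj = (kind, term, env)
        return valof(obj[1], obj[2], store)
    if kind == "susp":
        return ("store", valofcell(obj[1], store))
    raise ValueError(f"object form not covered by this sketch: {kind}")

def dec(loc, store):
    """Drop one reference to loc; free the cell when the count reaches zero."""
    count, obj = store[loc]
    new_store = dict(store)
    if count > 1:
        new_store[loc] = (count - 1, obj)
    else:
        del new_store[loc]
    return new_store

store = {
    0: (1, ("const", 3)),
    1: (1, ("closure", ("lam", "y", ("var", "y")), {})),
    2: (1, ("thunk", ("app", ("var", "g"), ("var", "a")), {"g": 1, "a": 0})),
    3: (1, ("susp", 2)),
    4: (1, ("const", 9)),  # not reachable from location 3
}
before = valofcell(3, store)
after = valofcell(3, dec(4, store))
```

Location 4 is not reachable from location 3, so freeing it leaves the unwound term unchanged.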

Also  essential  to  the  proof  of  Theorem  13  is  a  notion  of  when  one  term  is  ‘more  evaluated’  than 
another.  Section  5  defines  a  relation  >*  between  terms  which  expresses  this  relationship.  We  can 
prove  three  lemmas  about  the  relationship  of  >  and  canonical  forms. 

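Lemma 20 If $c > P$ and $c$ is a canonical form, then $P$ is a canonical form. Moreover, $c$ and $P$
have the same shape, i.e., if $c$ is a numeral or boolean, then $c = P$; if $c = \lambda x.\, Q$, then $P = \lambda x.\, Q'$;
and if $c = (\mathsf{store}\ Q)$, then $P = (\mathsf{store}\ Q')$.

Table 8: Definitions of valof and valofcell.

\[
\begin{array}{lcl}
\mathrm{valof}(x, \rho, \sigma) & = & \mathrm{valofcell}(\rho(x), \sigma) \\
\mathrm{valof}(\lambda x.\, P, \rho, \sigma) & = & \lambda x.\, \mathrm{valof}(P, \rho, \sigma), \quad x \notin \mathrm{dom}(\rho) \\
\mathrm{valof}((P\ Q), \rho, \sigma) & = & (\mathrm{valof}(P, \rho, \sigma)\ \mathrm{valof}(Q, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{fetch}\ P), \rho, \sigma) & = & (\mathsf{fetch}\ \mathrm{valof}(P, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{share}\ x, y\ \mathsf{as}\ P\ \mathsf{in}\ Q), \rho, \sigma) & = & (\mathsf{share}\ x, y\ \mathsf{as}\ \mathrm{valof}(P, \rho, \sigma)\ \mathsf{in}\ \mathrm{valof}(Q, \rho, \sigma)), \quad \text{where } x, y \notin \mathrm{dom}(\rho) \\
\mathrm{valof}((\mathsf{dispose}\ P\ \mathsf{before}\ Q), \rho, \sigma) & = & (\mathsf{dispose}\ \mathrm{valof}(P, \rho, \sigma)\ \mathsf{before}\ \mathrm{valof}(Q, \rho, \sigma)) \\
\mathrm{valof}(n, \rho, \sigma) & = & n \\
\mathrm{valof}(\mathsf{true}, \rho, \sigma) & = & \mathsf{true} \\
\mathrm{valof}(\mathsf{false}, \rho, \sigma) & = & \mathsf{false} \\
\mathrm{valof}((\mathsf{succ}\ P), \rho, \sigma) & = & (\mathsf{succ}\ \mathrm{valof}(P, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{pred}\ P), \rho, \sigma) & = & (\mathsf{pred}\ \mathrm{valof}(P, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{zero?}\ P), \rho, \sigma) & = & (\mathsf{zero?}\ \mathrm{valof}(P, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{fix}\ P), \rho, \sigma) & = & (\mathsf{fix}\ \mathrm{valof}(P, \rho, \sigma)) \\
\mathrm{valof}((\mathsf{if}\ N\ \mathsf{then}\ P\ \mathsf{else}\ Q), \rho, \sigma) & = & \mathsf{if}\ \mathrm{valof}(N, \rho, \sigma)\ \mathsf{then}\ \mathrm{valof}(P, \rho, \sigma)\ \mathsf{else}\ \mathrm{valof}(Q, \rho, \sigma)
\end{array}
\]

\[
\begin{array}{l}
\mathrm{valof}((\mathsf{store}\ N\ \mathsf{where}\ x_1 = M_1, \ldots, x_n = M_n), \rho, \sigma) \\
\quad = (\mathsf{store}\ \mathrm{valof}(N, \rho, \sigma)\ \mathsf{where}\ x_1 = \mathrm{valof}(M_1, \rho, \sigma), \ldots, x_n = \mathrm{valof}(M_n, \rho, \sigma)), \quad x_i \notin \mathrm{dom}(\rho)
\end{array}
\]

\[
\mathrm{valofcell}(l, \sigma) =
\begin{cases}
n & \text{if } \sigma(l) = n \\
\mathsf{true} & \text{if } \sigma(l) = \mathsf{true} \\
\mathsf{false} & \text{if } \sigma(l) = \mathsf{false} \\
\lambda x.\, \mathrm{valof}(M, \rho, \sigma) & \text{if } \sigma(l) = \mathrm{closure}(\lambda x.\, M, \rho) \text{ or } \sigma(l) = \mathrm{recclosure}(\lambda x.\, M, \rho) \text{ and } x \notin \mathrm{dom}(\rho) \\
(\mathsf{store}\ \mathrm{valofcell}(l', \sigma)) & \text{if } \sigma(l) = \mathrm{susp}(l') \\
\mathrm{valof}(M, \rho, \sigma) & \text{if } \sigma(l) = \mathrm{thunk}(M, \rho) \\
\mathrm{valof}((\mathsf{fix}\ (\mathsf{store}\ (\lambda f.\, \lambda x.\, M))), \rho, \sigma) & \text{if } \sigma(l) = \mathrm{rec}(l', f),\ \sigma(l') = \mathrm{recclosure}(\lambda x.\, M, \rho[f \mapsto l]),\ x, f \notin \mathrm{dom}(\rho)
\end{cases}
\]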

Proofs  of  the  Main  Theorems 




Proof: There are two cases to consider: either $c \Downarrow P$, or $c = C[M]$, $P = C[N]$, $C[\cdot]$ is nontrivial,
and $M \Downarrow N$. In the first case, since $c$ is canonical, $c = P$, and hence $P$ is canonical. In the second
case, for $c$ to be canonical it must be the case that $C[\cdot] = n$, $\mathsf{true}$, $\mathsf{false}$, $\lambda x.\, D[\cdot]$, or $(\mathsf{store}\ D[\cdot])$.
Thus, $P$ must be canonical as well, and must have the same shape as $c$. ■

Lemma 21 If $c$ is a canonical form and $M > c$, then $M \Downarrow d > c$.

Proof: By the definition of $M > c$, we know that $M = C[M']$, $c = C[d]$, and $M' \Downarrow d$. In order for
$c$ to be canonical, it must be the case that either $C[\cdot] = [\cdot]$, $n$, $\mathsf{true}$, $\mathsf{false}$, $\lambda x.\, D[\cdot]$, or $(\mathsf{store}\ D[\cdot])$.
In the first case, $M' = M$ and $d = c$, so $M \Downarrow c > c$. For the other cases, $M \Downarrow M > c$. ■

Lemma 22 If $c$ is a canonical form and $M >^* c$, then $M \Downarrow d >^* c$.

Proof: An easy induction on the length of $M = M_1 > \ldots > M_k > c$ using Lemma 20. ■

We need a similar definition of one state in the reference-counting interpreter being ‘more
evaluated’ than another. Basically, one state is more evaluated than another if, tracing from the
root set, the storable objects held at nodes are identical or thunks have been replaced by more
evaluated forms. Formally,

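Definition 23 We say $(l, \rho, \sigma) >^* (l, \rho, \sigma')$ if for all $l$ reachable from the root set, $l \in \mathrm{dom}(\sigma) \cap
\mathrm{dom}(\sigma')$ and

1. $\sigma(l) = n$, $\mathsf{true}$, $\mathsf{false}$, $\mathrm{closure}(\lambda x.\, N, \rho)$, or $\mathrm{recclosure}(\lambda x.\, N, \rho)$, and $\sigma(l) = \sigma'(l)$ and $(\rho, \sigma) >^* (\rho, \sigma')$; or

2. $\sigma(l) = \mathrm{susp}(l_0)$ or $\mathrm{rec}(l_0, f)$, $\sigma(l_0)$ is not a thunk, $\sigma'(l) = \mathrm{susp}(l_0)$ and $(l_0, \sigma) >^* (l_0, \sigma')$; or

3. $\sigma(l) = \mathrm{susp}(l_0)$, $\sigma(l_0) = \mathrm{thunk}(P, \rho)$ and either

(a) $\sigma'(l) = \mathrm{susp}(l_0)$, $\sigma'(l_0) = \mathrm{thunk}(P, \rho)$, and $(\rho, \sigma) >^* (\rho, \sigma')$; or

(b) $\sigma'(l) = \mathrm{susp}(l')$, $\sigma'(l')$ is not a thunk, $\mathrm{interp}(P, \rho, \sigma) = (l', \sigma'')$, and $(l', \sigma'') >^* (l', \sigma')$

where $(\rho, \sigma) >^* (\rho, \sigma')$ if for every $x \in \mathrm{dom}(\rho)$, $(\rho(x), \sigma) >^* (\rho(x), \sigma')$.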

It  is  not  difficult  to  prove  that  >*  is  reflexive  and  transitive  on  states.  It  is  also  not  difficult  to 
prove  the  following  two  lemmas: 

Lemma 24 Suppose $\mathcal{R}(l, \rho, \rho', \sigma)$ and $\mathrm{interp}(M, \rho, \sigma) = (l', \sigma')$. Then $(l, \rho', \sigma) >^* (l, \rho', \sigma')$.

Lemma 25 If $Q' >^* \mathrm{valof}(Q, \rho, \sigma)$ and $(l', \rho, \rho', \sigma) >^* (l', \rho, \rho', \sigma')$, then $Q' >^* \mathrm{valof}(Q, \rho, \sigma')$.

The  proof  of  the  first  is  an  easy  induction  on  the  number  of  calls  to  interp;  the  proof  of  the  second 
is  an  easy  induction  on  the  definition  of  valof. 

We  now  have  enough  machinery  to  prove  the  main  correctness  theorem. 

Theorem 13 Suppose $M$ is typeable, $\mathrm{dom}(\rho) = FV(M)$, $M'$ is closed, and $M' >^* \mathrm{valof}(M, \rho, \sigma)$.
Suppose also that $\mathcal{R}(l', \rho, \rho', \sigma)$.

1. If $M' \Downarrow c$, then $\mathrm{interp}(M, \rho, \sigma) = (l', \sigma')$ and $c >^* \mathrm{valofcell}(l', \sigma')$.

2. If $\mathrm{interp}(M, \rho, \sigma) = (l', \sigma')$, then $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$.

Proof: The first part is proven by induction on the height of the proof of $M' \Downarrow c$. We consider
the cases for the core language and leave the cases for the PCF extensions to the reader. To ease
the readability of the various cases, we separate each induction case into two cases based on
whether or not $M$ is a variable or a canonical form. The first of these cases can be seen immediately.
If $M$ is a canonical form or variable, then the form of the rules guarantees that $\mathrm{interp}(M, \rho, \sigma)$
returns a result $(l', \sigma')$ and $\mathrm{valofcell}(l', \sigma') = \mathrm{valof}(M, \rho, \sigma)$. Thus, by Lemma 22, it follows that $c >^*
\mathrm{valofcell}(l', \sigma')$. For instance, if $M = (\lambda x.\, P)$, then $\mathrm{interp}(M, \rho, \sigma) = \mathrm{new}(\mathrm{closure}(\lambda x.\, P, \rho), \sigma) = (l', \sigma')$.
Since the initial state is regular by hypothesis and the newly allocated location $l' \notin \mathrm{dom}(\sigma)$, the new
cell $l'$ in $\sigma'$ cannot be reached from $\sigma$. Thus,

\[ \mathrm{valofcell}(l', \sigma') = \mathrm{valof}(\lambda x.\, P, \rho, \sigma') = \mathrm{valof}(\lambda x.\, P, \rho, \sigma) \]

as desired.

If, on the other hand, $M$ is not a variable or canonical form, then there is some interpretation
required in the reference-counting interpreter. Now we divide into cases depending on the last rule
used in the proof of $M' \Downarrow c$.

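1. $M' = (P'\ Q')$, where $P' \Downarrow (\lambda x.\, N')$, $Q' \Downarrow d$, and $N'[x := d] \Downarrow c$. The only case to consider
is $M = (P\ Q)$, where $P' >^* \mathrm{valof}(P, \rho, \sigma)$ and $Q' >^* \mathrm{valof}(Q, \rho, \sigma)$. Since $M$ is typeable, the
free variables of $P$ and $Q$ are disjoint. The first step is to evaluate the operator and operand.
By induction,

\[ \mathrm{interp}(P, \rho \mid P, \sigma) = (l_0, \sigma_0) \]

and $(\lambda x.\, N') >^* \mathrm{valofcell}(l_0, \sigma_0)$. We need to show that $l_0$ really holds a closure. By Lemma 20,
$(\lambda x.\, N')$ and $\mathrm{valofcell}(l_0, \sigma_0)$ must have the same shape. Since $\mathcal{R}(l', \rho \mid P, \rho \mid Q, \rho', \sigma)$, by
Theorem 6 $\mathcal{R}(l_0, l', \rho \mid Q, \rho', \sigma_0)$ and so $\sigma_0(l_0)$ cannot be a thunk. Thus, the only possibility left is
that

\[ \sigma_0(l_0) = \mathrm{closure}(\lambda x.\, N, \rho') \text{ or } \mathrm{recclosure}(\lambda x.\, N, \rho'). \]

Next we need to evaluate the operand. By Lemma 24 we know $(l', \rho \mid Q, \rho', \sigma) >^*
(l', \rho \mid Q, \rho', \sigma_0)$. Since $Q' >^* \mathrm{valof}(Q, \rho \mid Q, \sigma)$, by Lemma 25 we have $Q' >^* \mathrm{valof}(Q, \rho \mid Q, \sigma_0)$.
By the induction hypothesis,

\[ \mathrm{interp}(Q, \rho \mid Q, \sigma_0) = (l_1, \sigma_1) \]

where $d >^* \mathrm{valofcell}(l_1, \sigma_1)$. Since $\mathrm{interp}(Q, \rho \mid Q, \sigma_0) = (l_1, \sigma_1)$, it follows from Lemma 24
that $(l_0, l', \rho', \sigma_0) >^* (l_0, l', \rho', \sigma_1)$; thus,

\[ \sigma_1(l_0) = \mathrm{closure}(\lambda x.\, N, \rho') \text{ or } \mathrm{recclosure}(\lambda x.\, N, \rho'). \]

To evaluate the application, there are two subcases: either $\mathrm{refcount}(l_0, \sigma_1) = 1$ or
$\mathrm{refcount}(l_0, \sigma_1) > 1$. We do the first case and leave the other case to the reader. If
$\mathrm{refcount}(l_0, \sigma_1) = 1$, then $\mathcal{R}(l', \rho'[x \mapsto l_1], \rho', \mathrm{dec}(l_0, \sigma_1))$ by laws A2 and E. It follows from

\[ (l', \rho', \sigma_0) >^* (l', \rho', \sigma_1) >^* (l', \rho', \mathrm{dec}(l_0, \sigma_1)) \]

and Lemma 25 that $N'[x := d] >^* \mathrm{valof}(N, \rho'[x \mapsto l_1], \mathrm{dec}(l_0, \sigma_1))$. Thus, by induction,

\[ \mathrm{interp}(N, \rho'[x \mapsto l_1], \mathrm{dec}(l_0, \sigma_1)) = (l', \sigma') \]

where $c >^* \mathrm{valofcell}(l', \sigma')$. This shows that $\mathrm{interp}(M, \rho, \sigma) = (l', \sigma')$ and $c >^* \mathrm{valofcell}(l', \sigma')$.

2. $M' = (\mathsf{store}\ N'\ \mathsf{where}\ x_1 = M'_1, \ldots, x_n = M'_n)$, $c = (\mathsf{store}\ N'[x_1, \ldots, x_n := d_1, \ldots, d_n])$, and
$M'_i \Downarrow d_i$. We only need to consider the case when $M = (\mathsf{store}\ N\ \mathsf{where}\ x_1 = M_1, \ldots, x_n =
M_n)$, where $N' >^* \mathrm{valof}(N, \emptyset, \sigma)$ and $M'_i >^* \mathrm{valof}(M_i, \rho \mid M_i, \sigma)$. Since $M$ is typeable, the free
variables of each $M_i$ are disjoint. Since $M'_1 >^* \mathrm{valof}(M_1, \rho \mid M_1, \sigma)$, by induction

\[ \mathrm{interp}(M_1, \rho_1, \sigma) = (l_1, \sigma_1), \]

where $d_1 >^* \mathrm{valofcell}(l_1, \sigma_1)$. By Lemma 24,

\[ (l', \rho \mid M_2, \ldots, \rho \mid M_n, \sigma) >^* (l', \rho \mid M_2, \ldots, \rho \mid M_n, \sigma_1) \]

and by Theorem 6, $\mathcal{R}(l_1, l', \rho \mid M_2, \ldots, \rho \mid M_n, \sigma_1)$. Since $M'_2 >^* \mathrm{valof}(M_2, \rho \mid M_2, \sigma)$, by
Lemma 25, $M'_2 >^* \mathrm{valof}(M_2, \rho \mid M_2, \sigma_1)$. Using similar repeated applications of the induction
hypothesis,

\[ \mathrm{interp}(M_i, \rho_i, \sigma_{i-1}) = (l_i, \sigma_i) \]

where $d_i >^* \mathrm{valofcell}(l_i, \sigma_i)$, and by Lemma 24,

\[ (l_1, \ldots, l_{i-1}, l', \rho', \sigma_{i-1}) >^* (l_1, \ldots, l_{i-1}, l', \rho', \sigma_i). \]

Finally, let

\[
\begin{array}{rcl}
\rho' & = & \emptyset[x_1, \ldots, x_n \mapsto l_1, \ldots, l_n] \\
\mathrm{new}(\mathrm{thunk}(N, \rho'), \sigma_n) & = & (l_{n+1}, \sigma_{n+1}) \\
\mathrm{new}(\mathrm{susp}(l_{n+1}), \sigma_{n+1}) & = & (l', \sigma')
\end{array}
\]

Then using Lemma 25, we find that $(\mathsf{store}\ N'[x_1, \ldots, x_n := d_1, \ldots, d_n]) >^* \mathrm{valofcell}(l', \sigma')$ as
desired.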

3. $M' = (\mathsf{fetch}\ N')$, where $N' \Downarrow (\mathsf{store}\ Q')$ and $Q' \Downarrow c$. Then the only case to consider is
$M = (\mathsf{fetch}\ N)$ where $N' >^* \mathrm{valof}(N, \rho, \sigma)$. By induction,

\[ \mathrm{interp}(N, \rho, \sigma) = (l_0, \sigma_0) \]

where $(\mathsf{store}\ Q') >^* \mathrm{valofcell}(l_0, \sigma_0)$. By Theorem 6, $\mathcal{R}(l_0, l', \rho', \sigma_0)$. Since $(\mathsf{store}\ Q') >^*
\mathrm{valofcell}(l_0, \sigma_0)$ and $\sigma_0(l_0)$ must be a value, it follows from Lemma 20 that $\sigma_0(l_0) = \mathrm{susp}(l_1)$
or $\mathrm{rec}(l_1, f)$ and $Q' >^* \mathrm{valofcell}(l_1, \sigma_0)$. We consider only the case when $\sigma_0(l_0)$ is $\mathrm{susp}(l_1)$ and
leave the other case to the reader. There are two subcases:

(a) $\sigma_0(l_1) = \mathrm{thunk}(R, \rho')$. There are two subcases:

i. $\mathrm{refcount}(l_0, \sigma_0) = 1$. First, note that neither $l_0$ nor $l_1$ is reachable from $\rho'$: if either
were, the state $S = (l_0, l', \rho', \sigma_0)$ would have a cycle that was not composed solely
of a rec and a recclosure, and this contradicts the regularity of the state $S$. Thus,

\[ Q' >^* \mathrm{valof}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0))). \]

By laws A1 and A2, $\mathcal{R}(l', \rho', \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0)))$. Thus, it follows by induction
that $\mathrm{interp}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0))) = (l', \sigma')$ and $c >^* \mathrm{valofcell}(l', \sigma')$ as desired.

ii. $\mathrm{refcount}(l_0, \sigma_0) > 1$. First, note that neither $l_0$ nor $l_1$ is reachable from $\rho'$:
if either were, there would be an illegal cycle in the memory graph induced by
$S = (l_0, l', \rho', \sigma_0)$. Thus, $Q' >^* \mathrm{valof}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0])))$ still holds. By
law U2, $\mathcal{R}(l', \rho', \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0])))$. Thus, it follows by induction that

\[ \mathrm{interp}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0]))) = (l_2, \sigma_1) \]

and $c >^* \mathrm{valofcell}(l_2, \sigma_1)$. Since $l_0$ is not reachable from $\rho'$ in $\mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0])$,
by Theorem 6 it follows that $l_0$ is not reachable from $l_2$ in $\sigma_1$. Thus, $c >^*
\mathrm{valofcell}(l_2, \mathrm{inc}(l_2, \sigma_1[l_0 \mapsto \mathrm{susp}(l_2)]))$ as desired.

(b) $\sigma_0(l_1) \neq \mathrm{thunk}(R, \rho')$. Then $\mathrm{valofcell}(l_1, \sigma_0)$ is a value. There are two subcases:

i. $\mathrm{refcount}(l_0, \sigma_0) = 1$. Since $Q' >^* \mathrm{valofcell}(l_1, \mathrm{dec}(l_0, \sigma_0))$, it follows by Lemma 22
that $c >^* \mathrm{valofcell}(l_1, \mathrm{dec}(l_0, \sigma_0))$.

ii. $\mathrm{refcount}(l_0, \sigma_0) > 1$. Similar to the previous subcase and hence omitted.

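4. $M' = (\mathsf{share}\ x, y\ \mathsf{as}\ P'\ \mathsf{in}\ Q')$, where $P' \Downarrow d$ and $Q'[x, y := d] \Downarrow c$. Then the only case to
consider is $M = (\mathsf{share}\ x, y\ \mathsf{as}\ P\ \mathsf{in}\ Q)$, where $P' >^* \mathrm{valof}(P, \rho \mid P, \sigma)$ and $Q' >^* \mathrm{valof}(Q, \rho \mid Q, \sigma)$.
Since $M$ is typeable, the free variables of $P$ and $Q$ are disjoint. Since $P' >^* \mathrm{valof}(P, \rho \mid P, \sigma)$,
it follows by induction that

\[ \mathrm{interp}(P, \rho \mid P, \sigma) = (l_0, \sigma_0), \]

where $d >^* \mathrm{valofcell}(l_0, \sigma_0)$. By Lemmas 24 and 25, $Q' >^* \mathrm{valof}(Q, \rho \mid Q, \sigma_0)$, and by
Theorem 6, $\mathcal{R}(l_0, l', \rho \mid Q, \rho', \sigma_0)$. By laws I1 and E,

\[ \mathcal{R}(l', (\rho \mid Q)[x, y \mapsto l_0], \rho', \mathrm{inc}(l_0, \sigma_0)). \]

Since $Q'[x, y := d] >^* \mathrm{valof}(Q, (\rho \mid Q)[x, y \mapsto l_0], \mathrm{inc}(l_0, \sigma_0))$, it follows by induction that

\[ \mathrm{interp}(Q, (\rho \mid Q)[x, y \mapsto l_0], \mathrm{inc}(l_0, \sigma_0)) = (l', \sigma') \]

and $c >^* \mathrm{valofcell}(l', \sigma')$ as desired.

5. $M' = (\mathsf{dispose}\ P'\ \mathsf{before}\ Q')$, where $Q' \Downarrow c$. This case is similar to the previous case and hence
omitted.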
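The two reference-count subcases of the application case differ only in what happens to the closure cell and to the pointers in its environment. The sketch below shows the arguments that would be handed to the recursive call to interp in each subcase. It is an illustration under stated assumptions, not the paper's interpreter: stores are dictionaries from locations to `(refcount, object)` pairs, dec is modelled as a non-recursive decrement that frees a cell at count zero, inc-env increments the count of every location in the range of an environment, and the names `enter_closure` and `inc_env` are hypothetical.

```python
def dec(loc, store):
    """Drop one reference to loc; free the cell when the count reaches zero."""
    count, obj = store[loc]
    new_store = dict(store)
    if count > 1:
        new_store[loc] = (count - 1, obj)
    else:
        del new_store[loc]
    return new_store

def inc_env(env, store):
    """Add one reference to every location in the range of env."""
    new_store = dict(store)
    for loc in env.values():
        count, obj = new_store[loc]
        new_store[loc] = (count + 1, obj)
    return new_store

def enter_closure(l0, l1, store):
    """Return (body, environment, store) for evaluating the closure at l0 applied to l1."""
    count, (kind, term, env) = store[l0]
    assert kind in ("closure", "recclosure")
    _, x, body = term  # term = ("lam", x, body)
    extended = {**env, x: l1}
    if count == 1:
        # Sole reference: free the cell; its environment pointers become roots.
        return body, extended, dec(l0, store)
    # Shared closure: the cell survives, so the environment is now referenced twice.
    return body, extended, inc_env(env, dec(l0, store))

identity = ("lam", "x", ("var", "x"))
shared = {0: (1, ("const", 5)), 1: (2, ("closure", identity, {"k": 0})), 2: (1, ("const", 7))}
unique = {**shared, 1: (1, ("closure", identity, {"k": 0}))}
body_s, env_s, store_s = enter_closure(1, 2, shared)
body_u, env_u, store_u = enter_closure(1, 2, unique)
```

When the closure is uniquely referenced, no counts in its environment change, because ownership of those pointers simply moves from the freed cell to the root set.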

This  completes  the  proof  of  the  first  part.  The  second  part  is  proven  by  induction  on  the  number 
of  calls  to  interp.  We  consider  the  cases  for  the  core  of  the  language  and  leave  the  cases  for  the 
PCF  extensions  to  the  reader. 

1. $M = x$. Then $\mathrm{interp}(M, \rho, \sigma) = (\rho(x), \sigma) = (l', \sigma')$. Note that $\mathrm{valofcell}(l', \sigma') =
\mathrm{valof}(M, \rho, \sigma)$, and hence $M' >^* \mathrm{valofcell}(l', \sigma')$. Since $\mathcal{R}(l', \rho, \rho', \sigma)$, $\sigma'(l') = \sigma(\rho(x))$ must
be a value, and hence $\mathrm{valofcell}(l', \sigma') = d$ where $d$ is a canonical form. Thus, by Lemma 22,
$M' \Downarrow c >^* d = \mathrm{valofcell}(l', \sigma')$ as desired.

2. $M = (\lambda x.\, N)$. Similar to the previous case.

3. $M = (P\ Q)$. Since $\mathrm{interp}(M, \rho, \sigma) = (l', \sigma')$, it follows that

\[
\begin{array}{rcl}
\mathrm{interp}(P, \rho \mid P, \sigma) & = & (l_0, \sigma_0) \\
\mathrm{interp}(Q, \rho \mid Q, \sigma_0) & = & (l_1, \sigma_1) \\
\sigma_1(l_0) & = & \mathrm{closure}(\lambda x.\, N, \rho') \text{ or } \mathrm{recclosure}(\lambda x.\, N, \rho')
\end{array}
\]

Since $M' >^* \mathrm{valof}(M, \rho, \sigma)$, it must be that $M' = (P'\ Q')$ for some closed $P'$ and $Q'$, where
$P' >^* \mathrm{valof}(P, \rho, \sigma)$ and $Q' >^* \mathrm{valof}(Q, \rho, \sigma)$. By induction,

\[ P' \Downarrow d' >^* \mathrm{valofcell}(l_0, \sigma_0) \]

\[ Q' \Downarrow d >^* \mathrm{valofcell}(l_1, \sigma_1) \]

By Lemmas 24 and 25, $d' >^* \mathrm{valofcell}(l_0, \sigma_1)$. Since $\sigma_1(l_0)$ is a closure, $\mathrm{valofcell}(l_0, \sigma_1)$ must
be a $\lambda$-abstraction, and so by Lemma 20 it follows that $d' = (\lambda x.\, N')$ for some $N'$. If
$\mathrm{refcount}(l_0, \sigma_1) = 1$, then $N'[x := d] >^* \mathrm{valof}(N, \rho'[x \mapsto l_1], \mathrm{dec}(l_0, \sigma_1))$. If, on the other
hand, $\mathrm{refcount}(l_0, \sigma_1) > 1$, then $N'[x := d] >^* \mathrm{valof}(N, \rho'[x \mapsto l_1], \mathrm{inc\text{-}env}(\rho', \mathrm{dec}(l_0, \sigma_1)))$. In
either case, it follows by the induction hypothesis that

\[ N'[x := d] \Downarrow c >^* \mathrm{valofcell}(l', \sigma'). \]

Thus, we conclude $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$.

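4. $M = (\mathsf{store}\ N\ \mathsf{where}\ x_1 = M_1, \ldots, x_n = M_n)$. Since $M$ evaluates,

\[
\begin{array}{rcl}
\mathrm{interp}(M_1, \rho \mid M_1, \sigma) & = & (l_1, \sigma_1) \\
\mathrm{interp}(M_2, \rho \mid M_2, \sigma_1) & = & (l_2, \sigma_2) \\
& \vdots & \\
\mathrm{interp}(M_n, \rho \mid M_n, \sigma_{n-1}) & = & (l_n, \sigma_n) \\
\rho' & = & \emptyset[x_1, \ldots, x_n \mapsto l_1, \ldots, l_n] \\
(l_{n+1}, \sigma_{n+1}) & = & \mathrm{new}(\mathrm{thunk}(N, \rho'), \sigma_n) \\
(l', \sigma') & = & \mathrm{new}(\mathrm{susp}(l_{n+1}), \sigma_{n+1})
\end{array}
\]

Since $M' >^* \mathrm{valof}(M, \rho, \sigma)$, it follows that $M' = (\mathsf{store}\ N'\ \mathsf{where}\ x_1 = M'_1, \ldots, x_n = M'_n)$ and
$N' >^* \mathrm{valof}(N, \emptyset, \sigma)$ and $M'_i >^* \mathrm{valof}(M_i, \rho \mid M_i, \sigma)$. By induction, $M'_1 \Downarrow c_1 >^* \mathrm{valofcell}(l_1, \sigma_1)$.
To evaluate the next term in the sequence, note that

\[ M'_2 >^* \mathrm{valof}(M_2, \rho_2, \sigma) >^* \mathrm{valof}(M_2, \rho_2, \sigma_1) \]

so by induction $M'_2 \Downarrow c_2 >^* \mathrm{valofcell}(l_2, \sigma_2)$. Extending the induction hypothesis further yields
that $M'_i \Downarrow c_i >^* \mathrm{valofcell}(l_i, \sigma_i)$. Note also that by Lemmas 24 and 25, $N' >^* \mathrm{valof}(N, \emptyset, \sigma_n)$;
it follows that

\[ (\mathsf{store}\ N'[x_1, \ldots, x_n := c_1, \ldots, c_n]) = c >^* \mathrm{valof}(N, \rho', \sigma_n) = \mathrm{valofcell}(l', \sigma'). \]

Thus $M'_i \Downarrow c_i$ and so $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$ as desired.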

5. $M = (\mathsf{fetch}\ P)$. Since $M$ evaluates, $\mathrm{interp}(P, \rho, \sigma) = (l_0, \sigma_0)$ and by Theorem 6,
$\mathcal{R}(l_0, l', \rho', \sigma_0)$. Since $M' >^* \mathrm{valof}(M, \rho, \sigma)$, it follows that $M' = (\mathsf{fetch}\ P')$ for some $P'$
and $P' >^* \mathrm{valof}(P, \rho, \sigma)$. By induction,

\[ P' \Downarrow d' >^* \mathrm{valofcell}(l_0, \sigma_0). \]

Note that $\sigma_0(l_0) = \mathrm{susp}(l_1)$ or $\mathrm{rec}(l_1, f)$; we consider the first case here and leave the other
to the reader. Since $\sigma_0(l_0)$ is a suspension, $\mathrm{valofcell}(l_0, \sigma_0) = (\mathsf{store}\ Q)$ for some $Q$ by the
definition of valofcell. Thus, by Lemma 20, $d' = (\mathsf{store}\ Q')$ for some $Q'$. There are now two subcases:

(a) $\sigma_0(l_1) = \mathrm{thunk}(R, \rho')$. There are two subcases depending on the reference count of $l_0$.

i. $\mathrm{refcount}(l_0, \sigma_0) = 1$. Then $\mathrm{interp}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0))) = (l', \sigma')$. Since the
state $S = (l_0, l', \rho', \sigma_0)$ is regular, it follows that $l_0$ is not reachable from $\rho'$;
otherwise, there would be an illegal cycle in the memory graph induced by $S$. Thus,

\[ Q = \mathrm{valof}(R, \rho', \sigma_0) = \mathrm{valof}(R, \rho', \mathrm{dec}(l_0, \sigma_0)), \]

and hence by induction

\[ Q' \Downarrow c >^* \mathrm{valofcell}(l', \sigma'). \]

Thus, $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$ as desired.

ii. $\mathrm{refcount}(l_0, \sigma_0) > 1$. Then $\mathrm{interp}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0]))) = (l_2, \sigma_1)$ and
$(l', \sigma') = (l_2, \mathrm{inc}(l_2, \sigma_1[l_0 \mapsto \mathrm{susp}(l_2)]))$. Note that because the state $S = (l_0, l', \rho', \sigma_0)$
is regular, neither $l_0$ nor $l_1$ is reachable from $\rho'$; otherwise, there would be an illegal
cycle in the memory graph induced by $S$. Thus,

\[ Q = \mathrm{valof}(R, \rho', \sigma_0) = \mathrm{valof}(R, \rho', \mathrm{dec}(l_1, \mathrm{dec}(l_0, \sigma_0[l_0 \mapsto 0]))) \]

and hence by induction

\[ Q' \Downarrow c >^* \mathrm{valofcell}(l_2, \sigma_1). \]

Since $\mathcal{R}(l_2, l', \rho', \mathrm{inc}(l_2, \sigma_1[l_0 \mapsto \mathrm{susp}(l_2)]))$ holds by Theorem 6, $l_0$ is not accessible
from $l_2$. Thus,

\[ \mathrm{valofcell}(l_2, \sigma_1) = \mathrm{valofcell}(l_2, \mathrm{inc}(l_2, \sigma_1[l_0 \mapsto \mathrm{susp}(l_2)])) = \mathrm{valofcell}(l', \sigma') \]

and so $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$ as desired.

(b) $\sigma_0(l_1) \neq \mathrm{thunk}(R, \rho')$. This case is straightforward and left to the reader.

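6. $M = (\mathsf{share}\ x, y\ \mathsf{as}\ P\ \mathsf{in}\ Q)$. Since $M$ evaluates,

\[ \mathrm{interp}(P, \rho \mid P, \sigma) = (l_0, \sigma_0) \]

\[ \mathrm{interp}(Q, (\rho \mid Q)[x, y \mapsto l_0], \mathrm{inc}(l_0, \sigma_0)) = (l', \sigma') \]

Since $M' >^* \mathrm{valof}(M, \rho, \sigma)$, it must be the case that $M' = (\mathsf{share}\ x, y\ \mathsf{as}\ P'\ \mathsf{in}\ Q')$ for some
terms $P' >^* \mathrm{valof}(P, \rho \mid P, \sigma)$ and $Q' >^* \mathrm{valof}(Q, \rho \mid Q, \sigma)$. By induction,

\[ P' \Downarrow d >^* \mathrm{valofcell}(l_0, \sigma_0). \]

Let $\sigma_1 = \mathrm{inc}(l_0, \sigma_0)$. By Lemmas 24 and 25,

\[ Q' >^* \mathrm{valof}(Q, \rho_2, \sigma) >^* \mathrm{valof}(Q, \rho_2, \mathrm{inc}(l_0, \sigma_0)). \]

Hence, $Q'[x, y := d] >^* \mathrm{valof}(Q, (\rho \mid Q)[x, y \mapsto l_0], \mathrm{inc}(l_0, \sigma_0))$. By induction,

\[ Q'[x, y := d] \Downarrow c >^* \mathrm{valofcell}(l', \sigma'). \]

Thus, $M' \Downarrow c >^* \mathrm{valofcell}(l', \sigma')$ as desired.

7. $M = (\mathsf{dispose}\ P\ \mathsf{before}\ Q)$. Similar to the previous case and hence omitted.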

This  completes  the  proof  of  the  second  claim  and  hence  the  proof  of  the  theorem.  ■ 
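The fetch cases of both parts follow the same store manipulations, which can be summarized as a single step function. This is a minimal sketch under stated assumptions: stores are dictionaries from locations to `(refcount, object)` pairs, dec is a non-recursive decrement that frees a cell at count zero, the evaluator is passed in as a callback, and the names `fetch_step` and `interp_var` are hypothetical. The last branch (a shared suspension whose target is already a value) is omitted in the proofs above, so the treatment given here is an assumption by analogy with the other branches.

```python
def dec(loc, store):
    """Drop one reference to loc; free the cell when the count reaches zero."""
    count, obj = store[loc]
    new_store = dict(store)
    if count > 1:
        new_store[loc] = (count - 1, obj)
    else:
        del new_store[loc]
    return new_store

def inc(loc, store):
    """Add one reference to loc."""
    count, obj = store[loc]
    return {**store, loc: (count + 1, obj)}

def fetch_step(l0, store, interp):
    """Fetch from the suspension at l0, forcing its thunk if necessary."""
    count0, obj0 = store[l0]
    assert obj0[0] == "susp"
    l1 = obj0[1]
    _, obj1 = store[l1]
    if obj1[0] == "thunk":
        _, term, env = obj1
        if count0 == 1:
            # Case (a)i: both the suspension and the thunk cell are freed.
            return interp(term, env, dec(l1, dec(l0, store)))
        # Case (a)ii: blank l0 with the dummy constant 0, force, then update l0.
        blanked = {**store, l0: (count0, ("const", 0))}
        l2, store1 = interp(term, env, dec(l1, dec(l0, blanked)))
        updated = {**store1, l0: (store1[l0][0], ("susp", l2))}
        return l2, inc(l2, updated)
    if count0 == 1:
        # Case (b)i: the suspension is freed and its pointer to l1 becomes a root.
        return l1, dec(l0, store)
    # Case (b)ii (assumed): the suspension survives, so l1 gains a reference.
    return l1, inc(l1, dec(l0, store))

def interp_var(term, env, store):
    """Stand-in evaluator that handles only variables."""
    assert term[0] == "var"
    return env[term[1]], store

thunk = ("thunk", ("var", "x"), {"x": 2})
shared = {0: (2, ("susp", 1)), 1: (1, thunk), 2: (1, ("const", 42))}
unique = {**shared, 0: (1, ("susp", 1))}
result_shared = fetch_step(0, shared, interp_var)
result_unique = fetch_step(0, unique, interp_var)
```

In the shared case the suspension at location 0 is overwritten to point at the computed value, so any later fetch through the remaining reference finds a value rather than a thunk.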